RESEARCH INTO THE USE OF SPEECH RECOGNITION ENHANCED MICRO WORLDS
IN AN AUTHORABLE LANGUAGE TUTOR

EXECUTIVE  SUMMARY 


Research  Requirement: 

Micro  Analysis  and  Design,  Inc.  and  the  University  of  Maryland  developed  the  Military 
Language  Tutor  (MILT)  under  an  earlier  contract  (MDA903-92-C-0229).  The  U.S.  Army 
Research  Institute  was  the  primary  sponsor  of  the  MILT  project.  MILT  was  also  sponsored 
through  ARI  by  the  Office  of  Special  Technology  of  DoD,  and  the  Defense  Advanced  Research 
Projects  Agency  (DARPA).  Initially  the  MILT  project  was  designed  to  investigate  the 
possibility  of  using  natural  language  processing  (NLP)  software  to  identify  semantic  and 
syntactic errors and provide the basis for state-of-the-art dialogue exercises.

A  graphical  flowchart  approach  to  designing  language  lessons  and  authoring  templates 
for  thirteen  different  exercise  types  were  developed  for  the  MILT  project.  One  of  the  exercise 
types  developed  was  the  microworld  exercise.  A  microworld  is  a  software  environment  in  which 
students can issue commands that are executed by animation routines in a game-like atmosphere.

Once  the  first  microworld  exercise  was  completed  and  integrated  into  MILT,  ARI  funded  the 
investigation of the use of discrete speech recognition technology in language learning using the
microworld  exercise  as  a  basis.  In  the  speech  recognition  microworld,  students  can  issue  spoken  commands 
in  Arabic  by  selecting  among  three  displayed  alternatives.  The  sequence  of  commands  can  be  changed  by 
the  author.  However,  new  commands  cannot  be  easily  added  to  the  exercise. 

The  goal  of  this  current  effort  was  to  expand  the  capabilities  of  MILT  and  incorporate 
continuous  speech  recognition  for  Arabic,  Spanish  and  English.  The  resulting  tool  includes  continuous 
speech  recognition  technology  and  significant  improvements  to  the  microworld  exercise. 

Procedure: 

The overall objective of this project was to develop a general-purpose, authorable
microworld that utilizes continuous speech recognition. The central tasks were 1) the design of
an enhanced microworld exercise, 2) development of continuous speech recognition components
for English, Arabic, and Spanish, 3) incorporation of speech recognition into the microworld
exercise, and 4) expansion of the Arabic natural language processing (NLP) system.

Findings: 

The effort resulted in the development of an authorable, speech recognition enhanced,
three-dimensional microworld exercise and an improved 32-bit MILT application.

Utilization  of  Findings: 

The effort provides an authoring tool and tutoring system that can be used to teach
mission-relevant information and language skills through the use of a three-dimensional
environment and continuous speech recognition.

Overview


This  report  documents  the  development  of  an  authorable  tutoring  system.  Specifically, 
the  focus  was  on  the  development  of  an  authorable,  three-dimensional,  speech  recognition 
enhanced  microworld  exercise  capable  of  supporting  military  and  language  informational  needs. 

As  part  of  this  effort  the  following  major  tasks  were  performed: 

•  an  enhanced  microworld  exercise  was  designed  and  developed 

•  continuous  speech  recognition  components  for  English,  Arabic,  and  Spanish  were  developed 

•  speech  recognition  was  incorporated  into  the  3D  microworld  exercise 

•  the  Arabic  natural  language  processing  (NLP)  system  was  expanded 

The effort was based on expansion of the Army Research Institute sponsored MILT 1.0
project.

Results  of  the  Work 

In  the  contract,  the  overall  objective  was  to  develop  a  general  purpose,  authorable, 
enhanced  microworld  exercise  that  includes  continuous  speech  recognition.  The  effort  resulted  in 
the  successful  completion  of  fourteen  tasks. 

•  Task 1: Development of a Work Plan describing the steps to be taken and the projected time
for completion

•  Task  2:  Acquisition  of  an  appropriate  continuous  speech  recognition  system  and  any  required 
expertise 

•  Task  3:  Engage  in  on-site  evaluation  of  discrete  speech  recognition  version  with  SOF  (Ft. 
Campbell,  KY) 

•  Task  4:  Develop  a  design  for  using  speech  recognition  in  the  tutor 

•  Task 5: After review by ARI personnel, revise the design

•  Task  6:  Implement  the  design 

•  Task  7:  Develop  Arabic  CSR  models 

•  Task  8:  Develop  Arabic,  Spanish  and  English  language  models 

•  Task  9:  Expand  the  Arabic  NLP  System 

•  Task  10:  Deliver  the  software 

•  Task 11: Develop an Arabic continuous speech recognition driven microworld exercise

•  Task  12:  Prepare  system  documentation  and  user  help 

•  Task  13:  Complete  monthly  progress  reports 

•  Task  14:  Complete  final  report 

These tasks are discussed in this section.

Task 1: Development of Work Plan


During this first task, detailed estimates for each of the remaining tasks were made.


Subtasks listed in the work plan chart:

Convert code to 32-bit
Investigate 3D Packages
Prepare design document
Evaluation At Ft Campbell
Incorporate New Splash Screen
Incorporate New Icons
Revise Edit Controls
Define New Scene Capability
Develop Scene Types
Edit Scene Interface
Develop Multiple Scene Editor
Object Authoring
Develop 3D Objects
Add Object Capability
Move Object Capability
Delete Object Capability
Object Attributes
Define Sign Object Attribute
Define Color Object Attribute
Define Graphic Object Attribute
Define Sound Object Attribute
Define Video Object Attribute
Define Combination Object Attribute
Define Connect Object Attribute
Define Container Object Attribute
Define Hang Object Attribute
Define Light Object Attribute
Define Lock Object Attribute
Define Open Object Attribute
Define Random Object Attribute
Define Rotate Object Attribute
Define Size Object Attribute
Define Text Object Attribute
Define Unlock Object Attribute
Gravity Algorithm
Define Label Sound Object Attribute
Develop texture map capability
Collision Algorithm
Synonym Capability
Develop Object Synonym Addition
Develop Sentence Synonym Ability
Improve String matcher
Develop Student Interface
Integrate New Actions/Objects/Synonyms
Multiple Similar Objects
Implement Object Labels
Implement Inventory Ability
Develop Arabic CSR Models
Translate Commands into Arabic & Spanish
Develop Language Models
Integrate Continuous Speech Recognition
Save Microworld Progress
Update Documentation
Improve Line Drawing

Figure 1 - Work Plan GANTT Chart



The  estimate  included  time  requirements,  resource  requirements,  personnel  requirements 
and  critical  subtask  identification.  A  GANTT  chart  (Figure  1)  was  developed  that  shows  each 
task  in  relation  to  the  project  schedule. 

Task 2: Acquisition of an appropriate continuous speech recognition system

Micro  Analysis  and  Design  evaluated  several  different  CSR  programs  that  were  available 
in  the  first  quarter  of  1997.  The  commercial  vendors  considered  were: 

•  Dragon  Speech  Systems 

•  Entropic Cambridge Research Laboratory

•  Lernout & Hauspie

•  Speech Systems

•  IBM

It was decided that the Entropic HTK system would be the most desirable to integrate into
MILT for the following reasons:

1. the accuracy of the CSR was rated highly

2.  models  had  been  built  in  languages  other  than  English  using  the  tool 

3.  it  was  fairly  easy  to  develop  language  models  in  languages  other  than  English 

4.  a  development  kit  was  available  for  incorporating  the  CSR  into  applications 

5.  other  government  agencies  were  using  the  system  in  products  that  it  might  be  desirable  to 
link  to  in  the  future 

6.  the  software  was  executable  on  a  PC  Windows  platform 

7.  the  software  was  speaker  independent 

8.  the  licensing  requirements  were  affordable 

9.  the  CSR  engine  uses  a  standard  PC  sound  card. 

A Windows version of the Entropic HTK CSR software was purchased for use in this
program. 

The  Army  Research  Institute  and  Micro  Analysis  and  Design  obtained  CSR  expertise  from 
the  U.S.  Military  Academy  (USMA)  for  the  translation  of  the  Arabic  utterances  and  development 
of  the  CSR  acoustic  and  language  models.  The  USMA  has  unique  technical  expertise  in  the 
development  of  Arabic  Language  Modules,  including  translation  experience. 

Task  3:  Engage  in  on-site  evaluation  of  discrete  speech  recognition  version 

Micro  Analysis  and  Design  and  ARI  prepared  for  and  performed  an  on-site  evaluation  of 
the two-dimensional discrete speech interactive language tutor in MILT version 1.0. The
evaluation  was  performed  in  June  1997  with  Special  Forces  personnel  at  Ft.  Campbell,  KY. 

Preparation for the evaluation involved:

•  development of 2 Arabic speech interactive lessons (1 for lower level students and 1 for more
advanced students)

•  development  of  background  questionnaire 

•  pre-test  and  post-test  development 

•  development  of  microworld  questionnaire 

•  development  of  survey  questionnaire 

•  performance  of  sample  evaluations  by  Arabic  students  in  the  DC  area 

•  preparation  of  the  equipment  (computers,  tape  recorders,  microphones,  CDs,  tapes,  etc.) 

At  Ft.  Campbell,  the  evaluation  procedure  followed  was: 

1. Briefly explained the purpose of the evaluation to the subject, including the unique
functional capability of MILT as a tutoring system and the importance of his or her inputs.

2. Had the subject complete the background questionnaire.

3. Conducted the pre-test: (a) gave the subject the microphone attached to the tape recorder and
had him or her test its recording; (b) gave him or her an English version of the 70 utterances;
(c) had him or her record his or her name, the date, and the word "pre-test"; (d) told him or
her to speak each of the 70 English utterances in Arabic to record on the tape, making sure that
he or she was speaking each utterance in order; (e) after completing the recording, took out the
audio tape and wrote the subject's name, date, and "pre-test" on the tape.

4. Told the subject to begin the listening and oral practice section. Started the subject on the
MILT pronunciation exercise.

5. After the subject completed the listening and oral practice section, gave the survey
questionnaire prepared for that section.

6. As soon as the subject completed the questionnaire, told him or her to continue to the next
section of the tutor: the oral production exercise in the microworld.

7. When the subject completed the tutor, gave the questionnaire prepared for the microworld
section.

8. After the subject completed the questionnaire, conducted the post-test: (a) gave the subject
the microphone attached to the tape recorder and had him or her test its recording; (b) gave him
or her the same English version of the 70 utterances used for the pre-test; (c) had him or her
record his or her name, the date, and the word "post-test" on the tape; (d) told him or her to
speak each of the 70 English utterances in Arabic to record on the tape. After completing the
recording, took out the audio tape and wrote the subject's name, date, and "post-test" on the
tape.

9. After completing the evaluation, thanked the subject for his or her participation.

Twenty-one subjects performed the evaluation.

The questionnaire and audio tapes were used to assess the usability of MILT from the
subjective student assessments and from proficiency ratings of the Arabic speaking on the audio
tapes.

The recorded audio tapes were delivered to a native Arabic speaking linguist with copies
of the assessment form prepared for rating the proficiency level of Arabic usage.

The  Arabic-speaking  linguist: 

(a) rated each of the four dimensions of Arabic usage (Vocabulary, Grammar, Pronunciation,
Fluency) using a 5-point scale, with "0" for "not being able to speak at all" and "5" for
"excellent - perfect or near perfect";

(b) listened to at least a quarter of each tape before rating the subject's proficiency level on the
assessment form;

(c) listened to the first utterance recorded for the pre-test and listened to the same utterance
recorded for the post-test before rating them;

(d)  listened  to  the  first  utterance  recorded  for  the  pre-test  again  and  rated  each  of  the  four 
dimensions,  and  then  listened  to  the  same  utterance  recorded  for  the  post-test  again  and  rated 
it  on  each  of  the  four  dimensions;  continued  this  process  until  all  utterances  were  rated. 

All four dimensions (Vocabulary, Grammar, Pronunciation, and Fluency) showed
improvement in the post-test compared with the pre-test.

Task  4:  Develop  design 

This  task  involved  writing  a  design  document  that  detailed  the  design  for  expanding  the 
capabilities  of  MILT  and  adding  continuous  speech  recognition.  A  document  entitled  “MILT 
2.0  Design  Document”  was  prepared  and  delivered  to  ARI  in  May  1997. 

The  “MILT  2.0  Design  Document”  described  architectural  changes  (port  to  32bit,  and  use 
of  OpenGL  3D  environment),  significant  microworld  changes,  the  microworld  authoring  design, 
the  design  of  the  student  perspective,  and  the  designed  use  of  speech  recognition. 

Task 5: After review by ARI personnel, revise the design

ARI personnel reviewed the “MILT 2.0 Design Document” and prepared a list of several
comments  and  questions.  MA&D  personnel  then  revised  the  design  document  and  submitted  a 
revised  document  in  June  1997. 

Task  6:  Implement  the  design 

A  significant  amount  of  the  effort  on  this  contract  was  dedicated  to  implementing  the  design. 
This  section  describes  the  developed  software. 

32-bit  Environment 

MILT  1.x  was  a  Windows  3.x  compliant  (16-bit)  application.  This  allowed  for  its  use  on 
many common 16-bit systems. However, the 16-bit versions of Windows lacked robustness and
effective multitasking. The current family of 32-bit operating systems offered by Microsoft
(Windows 95/98 and Windows NT 4.0) provides developers with a much more robust and
‘user-friendly’ operating environment. To take advantage of the 32-bit environment, the entire
MILT application was reprogrammed using Microsoft Visual C++.

3D Graphic Environment

The microworld-based lesson types in MILT 1.x used standard 2D graphics. With the
expansion of MILT 2.0 into the 32-bit arena, 3D graphics APIs became available. The most
industry-tested API was Silicon Graphics' OpenGL. OpenGL can be accelerated by video cards
that support it, which provides much faster rates of operation, but it can also be handled
completely in software.

Other APIs, such as Direct3D, were also available at the time for developing 3D graphics.
MA&D conducted an investigation into which API would best serve the MILT purposes.
OpenGL was chosen as the API to use in development of MILT 2.0 because it is widely
supported by graphic application and hardware vendors and because other ARI programs were
using OpenGL.

The  MILT  3D  Microworld  was  developed  using  an  OpenGL  development  environment 
called  Open  Inventor.  This  software  is  "host  ID  locked."  Host  ID  locking  is  a  form  of  software 
license  control  that  allows  Open  Inventor  to  run  only  on  specifically  licensed  systems.  Each  time 
Open  Inventor  runs,  it  checks  for  a  valid  password. 

Open Inventor uses both VRML (Virtual Reality Modeling Language) and Open Inventor
objects. VRML is the file format adopted by the Internet community for 3D geometry data on
the World Wide Web.

To  execute  the  MILT  3D  software,  users  will  need  to  obtain  a  runtime  license  for  the  host 
computer  from  Template  Graphics  Software  (TGS).  The  web  page  for  TGS  is  www.tgs.com. 

Significant  Microworld  Changes 

Microworld  Overview 

The  microworld  exercise  in  both  versions  1.x  and  2.0  is  an  authorable  game-like  exercise 
developed  to  complement  the  standard  question  types  like  multiple  choice,  question  and  answer. 
The  authorability  component  of  the  microworld  exercise  makes  it  both  complex  and  reusable.  In 
traditional  graphic  games,  once  the  user  has  gone  through  the  game  its  utility  has  greatly 
diminished  because  the  user  already  knows  what  objects  are  in  each  location  and  what  the  steps 
are  to  complete  the  game.  The  MILT  microworld  exercise  can  be  used  to  develop  virtually 
infinite  numbers  of  different  exercises.  The  author  can  change  the  question  to  be  answered  (the 
goal  of  the  exercise),  the  location  of  objects,  the  attributes  of  objects,  the  background  (texture)  of 
each  scene,  the  number  of  scenes  available,  and  the  answer  feedback  received  (both  incorrect  and 
correct). 

In  the  microworld  exercise,  the  author  develops  a  question  for  the  student  to  answer.  The 
student must find the answer by manipulating objects in one or more scenes. There are two input
modes for the microworld exercise: the first mode requires the student to type commands on the
keyboard; the second mode uses speech recognition technology to allow students to issue spoken
commands to the MILT microworld.

This section points out the major changes that were made to the MILT microworld
exercise for Version 2.0. Each of these changes is discussed in detail in the Authoring and/or
Student Perspective sections.

Attributes and Objects

Version  2.0  includes  many  more  objects  and  attributes  than  Version  1.x.  In  addition,  all  of 
the  low  quality,  bitmap  based,  2D  object  graphics  used  in  1.x  were  completely  redone  using 
OpenGL  to  make  much  more  realistic  3D  graphic  representations. 

Actions 

Actions  are  essentially  verbs  that  the  animated  microworld  “understands”  and  can 
perform.  The  action  list  for  version  2.0  was  significantly  expanded  from  Version  1.x. 

Labels 

Students  now  have  the  ability  to  gain  information  about  objects  by  looking  at  that 
object’s  “label”.  The  label  displays  the  name  of  the  object  and  gives  the  student  information 
about what actions can be performed on it (e.g., the book can be read, opened, closed, moved,
etc.). Labels are displayed when the user selects an object with the mouse. The original screen
text  appears  in  English.  Students  are  able  to  click  a  button  to  access  the  same  information  in  the 
current  foreign  language.  Authors  can  add  synonyms  that  will  be  displayed  to  the  student  as 
object  names.  Authors  cannot  change  the  actions  that  can  be  performed  on  an  object.  Authors 
also  have  the  option  of  recording  one  WAV  sound  file  that  the  student  can  play  from  the  Label 
screen. 


First  Person 

The  perspective  in  MILT  1.x  is  third  person.  It  was  decided  to  develop  MILT  2.0  in  first 
person  so  that  the  user  can  really  feel  like  he/she  is  there. 

Multiple  Similar  Objects 

Through  the  use  of  complex  algorithms  and  object  attributes,  MILT  2.0  allows  more  than 
one  object  of  a  certain  type  to  be  used  in  the  same  room.  This  also  allows  students  to  work  with 
adjectives  and  allows  students  to  carry  objects  from  one  scene  to  another. 
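
One way to picture how object attributes can distinguish several objects of the same type is the short Python sketch below. It is an illustration only and is not the algorithm used in MILT; the `SceneObject` class, the `resolve` function, and the sample attribute values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    kind: str                                      # object type, e.g. "book"
    attributes: set = field(default_factory=set)   # e.g. {"red"}


def resolve(objects, kind, adjectives=()):
    """Return every object of the given kind that carries all of the adjectives."""
    wanted = set(adjectives)
    return [o for o in objects if o.kind == kind and wanted <= o.attributes]


room = [
    SceneObject("book", {"red"}),
    SceneObject("book", {"blue"}),
    SceneObject("table", {"large"}),
]

matches = resolve(room, "book", ["red"])   # the adjective singles out one book
ambiguous = resolve(room, "book")          # two candidates: the command is ambiguous
```

In a sketch like this, a command is only executed when exactly one object matches; an empty or multi-element result would prompt the student to rephrase.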

Inventory 

In  version  1.x  the  agent  could  not  carry  items  from  scene  to  scene  and  could  not  just  hold 
onto  an  item.  For  example,  a  legal  command  was  “carry  the  book  to  the  table”,  but  “carry  the 
book”  was  not.  Commands  such  as  “pick  up”  and  “hold”  were  not  recognized  commands. 

In MILT version 2.0, both the ability to carry objects from scene to scene and the ability
to hold multiple items were developed. The ability to “inventory” multiple items made these
new capabilities possible.
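
The inventory idea can be sketched in a few lines of Python. This is a hypothetical illustration, not the MILT implementation: the `Agent` class, its method names, and the scene contents are all assumed for the example.

```python
class Agent:
    """A first-person agent that can hold several items and move between scenes."""

    def __init__(self, scene):
        self.scene = scene
        self.inventory = []

    def pick_up(self, item, scenes):
        """Move an item from the current scene into the inventory."""
        if item not in scenes[self.scene]:
            return False
        scenes[self.scene].remove(item)
        self.inventory.append(item)
        return True

    def go_to(self, scene):
        """Change scene; held items travel with the agent."""
        self.scene = scene

    def put_down(self, item, scenes):
        """Move a held item into the current scene."""
        if item not in self.inventory:
            return False
        self.inventory.remove(item)
        scenes[self.scene].append(item)
        return True


scenes = {"office": ["book", "key"], "hallway": []}
agent = Agent("office")
agent.pick_up("book", scenes)
agent.pick_up("key", scenes)       # holding multiple items at once
agent.go_to("hallway")
agent.put_down("book", scenes)     # the book has been carried to another scene
```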

Saving  Exercise  Progress 

In version 1.x a student could exit MILT during a lesson and had the option of saving
their progress so that they could resume the lesson at a later time. MILT stored the student score
thus far in the lesson and the exercise that should be resumed. However, MILT did not store
individual microworld exercise status or progress; e.g., if a student exited MILT while in the 2nd
room of a microworld exercise, when they resumed the lesson they would be placed at the
beginning (1st room) of the microworld exercise.

It was envisioned that MILT 2.0 microworld exercises would be longer in duration, with
more scenes and objects than those in MILT 1.0. To support these longer exercises and allow the
student to use the software over multiple sessions, the ability to save exercise progress was
developed.
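
A minimal Python sketch of the idea follows. The report does not describe how MILT stores progress, so the JSON file format, the field names, and the `save_progress` and `load_progress` helpers are assumptions made for illustration. The point is that the saved state includes the microworld status (current scene, held items) in addition to the lesson score and exercise number.

```python
import json
import os
import tempfile


def save_progress(path, state):
    """Write the lesson and microworld state to disk."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(state, f)


def load_progress(path):
    """Read a previously saved state back in."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


state = {
    "lesson_score": 40,          # stored in version 1.x as well
    "exercise": 3,               # stored in version 1.x as well
    "microworld": {              # hypothetical per-exercise status
        "scene": "room_2",
        "inventory": ["key"],
        "opened": ["desk"],
    },
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "progress.json")
    save_progress(path, state)
    restored = load_progress(path)   # the student resumes in room_2, not room_1
```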

Improved  Object  Interactions 

One  of  the  complaints  of  MILT  1.0  was  its  lack  of  "physical"  world  knowledge.  For 
example,  there  was  no  "gravity"  or  "collision"  mechanism  implemented.  In  MILT  2.0,  the  user  is 
able  to  better  manipulate  objects  and  objects  will  interact  with  each  other  in  a  more  natural 
fashion.  For  example,  objects  can  collide  with  each  other  and  objects  will  fall  to  the  ground. 
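
The following Python sketch shows one common way such behavior can be approximated, using axis-aligned bounding boxes. It is not the MILT gravity or collision algorithm, which the report does not detail; the `Box` class, the `overlaps` test, and the `drop` routine are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Box:
    x: float   # minimum corner; y is the vertical axis
    y: float
    z: float
    w: float   # extent along x
    h: float   # extent along y
    d: float   # extent along z


def overlaps(a, b):
    """True if two boxes intersect (boxes that merely touch do not collide)."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h and
            a.z < b.z + b.d and b.z < a.z + a.d)


def footprint_overlaps(a, b):
    """True if the boxes overlap when viewed from directly above."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.z < b.z + b.d and b.z < a.z + a.d)


def drop(obj, others, floor=0.0):
    """Lower obj until it rests on the floor or on the highest object beneath it."""
    rest = floor
    for other in others:
        top = other.y + other.h
        if footprint_overlaps(obj, other) and top <= obj.y:
            rest = max(rest, top)
    obj.y = rest
    return obj


table = Box(0, 0, 0, 2, 1, 2)
book = Box(0.5, 3, 0.5, 0.5, 0.1, 0.5)
pen = Box(5, 3, 5, 0.1, 0.1, 0.1)
drop(book, [table])   # released above the table, the book lands on the table top
drop(pen, [table])    # nothing underneath, so the pen falls to the floor
```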

Object  Synonym  Authoring 

MILT 1.0 was very limiting because the string matcher used did not allow synonyms for
object  names  to  be  used  by  the  student.  The  microworld  only  recognized  one  word  to  describe 
each  object.  This  meant  that  if  the  student  entered  “Open  the  trashcan”  instead  of  “Open  the 
wastebasket”  the  command  was  not  recognized.  In  MILT  2.0,  authors  can  enter  synonyms  for 
objects. 
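
The effect of authored synonyms can be illustrated with a small Python sketch. It assumes a simple word-for-word substitution before matching, which may differ from the MILT 2.0 string matcher; the `normalize` and `recognized` functions and the synonym table are hypothetical, and only single-word synonyms are handled.

```python
def normalize(command, synonyms):
    """Replace any authored synonym with the object's canonical name."""
    words = command.lower().rstrip(".!?").split()
    return " ".join(synonyms.get(word, word) for word in words)


# Author-entered synonyms: each maps to the one name the microworld knows.
synonyms = {"trashcan": "wastebasket", "bin": "wastebasket"}
known_commands = {"open the wastebasket"}


def recognized(command):
    """True if the command matches a known command after synonym substitution."""
    return normalize(command, synonyms) in known_commands
```

With this table, "Open the trashcan" and "Open the wastebasket" both reduce to the same known command, while a command naming an unknown object is still rejected.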

Continuous  Speech  Recognition 

In MILT 1.x, the use of discrete speech recognition in language learning was
investigated. The Dragon VoiceTools discrete recognition development system was used to
create an Arabic user model and vocabulary file for 73 different microworld-specific utterances,
and the speech recognition engine was integrated with MILT 1.0. Students issued spoken
commands to the MILT microworld in Arabic by selecting among three alternative statements
displayed at the bottom of the computer screen. The issuance of a command triggered the
animation and caused three new commands to be displayed at the bottom of the screen. The
author could change the sequence of commands. However, adding new commands required the
use of the native Dragon development software.
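
The interaction pattern described above can be sketched as a simple loop in Python. This is a hypothetical reconstruction: it assumes a linear sequence in which any of the three displayed commands advances the exercise, and it uses English placeholder strings in place of the Arabic utterances. The `steps` list and `run` function are not taken from MILT.

```python
# Each step is the set of three alternatives displayed at the bottom of the screen.
steps = [
    ["go to the door", "open the window", "pick up the pen"],
    ["open the door", "close the door", "knock on the door"],
]


def run(steps, utterances):
    """Advance one step each time the recognized utterance is among the three shown."""
    index = 0
    performed = []
    for spoken in utterances:
        if index >= len(steps):
            break
        if spoken in steps[index]:
            performed.append(spoken)   # stands in for triggering the animation
            index += 1                 # three new alternatives are displayed
    return performed, index


performed, index = run(steps, ["sing a song", "go to the door", "open the door"])
```

Because the recognizer only has to choose among the displayed alternatives, an utterance outside the current set is simply ignored.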

Several problems were identified with the MILT 1.x discrete recognition approach:

1. it resulted in oral reading practice, but not in natural speech production

2.  since  alternative  commands  were  read  from  the  screen,  the  natural  language  processing  error 
analysis  capability  could  not  be  used 

3.  the  author  was  required  to  engage  in  a  time-consuming  effort  to  author  the  sequence  of  three 
possible  commands  displayed  to  the  student 

In  Version  2.0,  we  attempted  to  solve  these  problems  and  expand  the  utility  of  MILT  by 
incorporating  a  continuous  speech  recognition  engine. 

The  continuous  speech  recognition  software  used  is  the  HTK  software.  West  Point 
Military  Academy  developed  the  Arabic  language  and  acoustic  continuous  speech  recognition 
modules  for  use  in  MILT  2.0.  Micro  Analysis  and  Design,  Inc.  developed  the  English  and 
Spanish  language  modules.  The  English  and  Spanish  Entropic  HTK  acoustic  modules  were 
used. The CSR development is discussed further in Tasks 7 and 8.

Microworld Authoring Environment

Authoring  Overview 

The  general  authoring  approach  used  in  MILT  to  develop  new  lessons  did  not  change  in 
MILT  2.0.  The  MILT  Authoring  module  was  designed  so  that  instructors  without  programming 
expertise  can  easily  develop  lessons.  The  lesson  is  defined  using  a  graphical  flowchart  approach 
and  each  individual  exercise  is  defined  through  the  use  of  template  screens.  The  MILT  lessons 
are  developed  and  stored  in  a  hierarchical  format.  Authors  create  lessons,  and  lessons  consist  of 
one  or  more  groups,  or  sets,  of  exercises. 
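
The hierarchy described above (lessons containing exercise sets, which contain exercises) can be pictured with the Python sketch below. The class names, fields, and sample lesson are hypothetical and do not reflect the MILT file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Exercise:
    name: str
    kind: str   # exercise type, e.g. "Multiple Choice"


@dataclass
class ExerciseSet:
    name: str
    exercises: List[Exercise] = field(default_factory=list)


@dataclass
class Lesson:
    title: str
    sets: List[ExerciseSet] = field(default_factory=list)

    def exercise_count(self):
        """Total number of exercises across all sets in the lesson."""
        return sum(len(s.exercises) for s in self.sets)


lesson = Lesson("Sample lesson", [
    ExerciseSet("Warm-up", [Exercise("Intro", "Introduction/Tutorial")]),
    ExerciseSet("Practice", [
        Exercise("Microworld 1", "3D MicroWorld Exercise"),
        Exercise("Review", "Multiple Choice"),
    ]),
])
```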

Once  the  author  has  graphically  defined  an  exercise,  he  can  define  the  exercise  type  and 
parameters  by  clicking  on  an  exercise  node.  MILT  first  displays  the  Exercise  Attributes  window 
(Figure  2). 


[Screen capture: Exercise Attributes window for the exercise "Microworld 1". The exercise type
list shows: 3D MicroWorld Exercise, Cloze/Menu Built Sentence, End of Exercise Set Node, Fill in
the Blank, ID Location, Introduction/Tutorial, Multiple Choice, Pronunciation Exercise, Question
and Answer, Sorting Exercise, Translation/Transcription.]

Figure 2 - Exercise Attributes Window


The  Exercise  Attributes  window  allows  the  author  to  define  the  type  of  exercise,  name 
the  exercise,  renumber  the  exercise,  record  any  comments  about  the  exercise,  and  access  the 
exercise  parameters. 


To  begin  defining  a  microworld  exercise  the  author  selects  the  ‘3D  MicroWorld  Exercise’ 
option then clicks on the ‘Edit Exercise’ button.

Microworld Editor

After  clicking  ‘Edit  Exercise’  MILT  will  display  a  window  similar  to  Figure  3.  From  this 
window  the  author  defines  the  exercise  characteristics  such  as  the  number  of  scenes  in  the 
exercise,  the  exercise  textures,  the  objects  that  will  be  included  in  each  scene,  the  location  of 
scenes,  and  object  attributes. 


Figure  3  -  Microworld  Editor 


The  microworld  exercise  allows  authors  to  link  several  rooms  or  scenes  together  in  order 
to  make  a  longer  and  more  interesting  game  environment.  A  screen  similar  to  Figure  3  shows  the 
overall  mapping  that  has  been  defined  for  the  current  microworld  exercise.  The  author  is  shown 
an  outside  environment  and  places  new  scenes  (buildings)  and  objects  on  this  outside 
“microworld.”  The  author  can  double  click  on  any  existing  scene  or  select  a  scene  name  from  the 
scene  menu  to  begin  editing  any  particular  scene.  The  interface  was  designed  so  that  when  an 
author  wants  to  change  a  given  room  that  may  be  deep  into  the  microworld  scene  structure,  he 
can go there directly rather than going through all of the rooms that lead to it.

MENU ITEMS


The  Menu  Items  for  authoring  3D  MicroWorld  Exercises  are: 

File  -  This  menu  is  used  to  close  the  viewer  window. 

Edit  -  This  menu  is  used  to  define  object  Attributes,  Enter  scenes,  and 
Cut/Copy/Paste/Delete  objects. 

Scene  -  The  Scene  menu  is  used  to  define  New  Scenes,  Edit  the  background  color  of  the 
current  scene,  and  move  to  any  defined  scene. 

Objects  -  This  menu  allows  you  to  display  the  object  Palette. 

View  -  Switch  between  Examiner  Viewer  and  Plane  Viewer  using  this  menu. 

Options  -  The  Texture  Quality  is  set  from  this  menu.  Authors  can  also  test  the  Scene  from 
the  student  perspective. 

Viewpoints  -  Authors  create  new  viewpoints  and  move  to  stored  points  from  this  menu. 

Help  -  Access  the  3D  Microworld  Help  file. 

VIEWERS 

There  are  two  different  viewers  that  can  be  used  to  develop  microworld  scenes.  The 
default  view  is  called  the  ‘Examiner’  view.  The  other  available  view  is  called  a  ‘Plane  Viewer’. 

To switch between views, users select the ‘View’ menu and choose the desired view.

Either  viewer  allows  the  user  to  interactively  change  the  view  of  the  scene  through  direct 
manipulation,  or  indirect  slider  and  push  button  controls. 

The Examiner viewer component (Figure 4) allows users to rotate the view around a point
of interest using a virtual trackball. In addition to camera rotation around the point of
interest, this viewer allows the user to translate the camera in the viewer plane and to dolly
(move forward and backward) to get closer to or farther away from the point of interest. The
viewer also supports seek, which quickly moves the camera to a desired object or point.
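
The camera motions named above can be illustrated with a small amount of vector arithmetic. The Python sketch below is not Open Inventor code and does not use its API; the `orbit` and `dolly` functions are hypothetical, and `orbit` handles rotation about the vertical axis only, whereas a virtual trackball permits rotation about any axis.

```python
import math


def orbit(camera, target, yaw):
    """Rotate the camera position about the vertical axis through target by yaw radians."""
    dx = camera[0] - target[0]
    dz = camera[2] - target[2]
    c, s = math.cos(yaw), math.sin(yaw)
    return (target[0] + c * dx + s * dz, camera[1], target[2] - s * dx + c * dz)


def dolly(camera, target, fraction):
    """Move the camera toward target by a fraction of the distance (negative moves away)."""
    return tuple(c + fraction * (t - c) for c, t in zip(camera, target))


camera = (0.0, 0.0, 10.0)
point_of_interest = (0.0, 0.0, 0.0)

rotated = orbit(camera, point_of_interest, math.pi / 2)   # quarter turn around the point
closer = dolly(camera, point_of_interest, 0.5)            # halve the distance
```

Orbiting keeps the distance to the point of interest constant, while a dolly changes only that distance and leaves the viewing direction unchanged.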
Figure 4 - Examiner Viewer

The  Plane  Viewer  (Figure  5)  constrains  the  camera  to  move  only  parallel  to  the  view 
plane.  Convenience  buttons  are  provided  (right  side  of  viewer  decorations)  to  set  the  camera  in 
each  of  the  major  planes  (X,  Y,  Z).  The  Plane  viewer  component  allows  the  user  to  translate  the 
camera  in  the  viewing  plane,  as  well  as  dolly  (move  forward/backward)  and  zoom  in  and  out.  The 
viewer  also  allows  the  user  to  roll  the  camera  (rotate  around  the  forward  direction)  and  seek  to 
objects which will specify a new viewing plane.

Figure 5 - Plane Viewer


EDIT  SCENE  TOOLS 

A  toolbar  is  displayed  to  the  right  of  the  view.  Each  tool  is  described  below.  Note:  Some 
of  the  tools  are  not  available  in  both  viewers. 

Select/Pick  Button 

The  author  will  use  this  tool  to  place  objects  in  the  scene,  resize  objects,  rotate  objects, 
and  define  object  attributes.  The  cursor  shape  will  change  to  an  arrow. 

View  Button 

Selects  viewer  mode.  The  cursor  shape  will  change  to  a  hand  icon.  In  this  mode,  the 
author  can  move  the  camera  in  3D  space. 

Help 

This  menu  provides  help  about  the  application. 


Home Button

Returns the camera to its home position (initial position if not reset). Use this tool if you
have “lost yourself” in the 3D environment.


Set  Home  Button 


Resets  the  home  position  to  the  current  camera  position. 


View  All  Button 


Brings  the  entire  scene  into  view. 


Seek  Button 


Allows  the  user  to  select  a  new  center  of  rotation  for  the  camera.  When  clicked  on  (and  in 
viewer  mode)  the  cursor  changes  to  a  crosshair.  The  next  left  mouse  button  press  causes 
whatever  is  underneath  the  cursor  to  be  selected  as  the  new  center  of  rotation.  Once  the  button  is 
released,  the  camera  either  jumps  or  animates  to  its  new  position  depending  on  the  current  setting 
of  the  seek  time  in  the  preferences  dialog  box. 

Camera Alignment Buttons


Select  the  axis  of  alignment  (X,  Y,  or  Z)  of  the  camera.  Use  these  buttons  to  change  the 
currently  displayed  view.  Only  displayed  for  Plane  Viewers. 


Projection  Button 


Selects the type of camera used by the viewer. It toggles between the two
available camera types, perspective and orthographic. Only displayed for Examiner and Plane
viewers. The Dolly thumbwheel is only available with the perspective camera.


Rotate/Translate  X  Wheel 

Rotates the scene about (Examiner View) or translates it along (Plane View) the screen X axis.

Rotate/Translate Y Wheel


Rotates the scene about (Examiner View) or translates it along (Plane View) the screen Y axis.


Dolly  Wheel 

This  wheel  moves  the  camera  in  and  out. 

Zoom  Slider 

Adjusts  the  camera  field  of  view.  Field  of  view  is  specified  in  degrees. 
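
The Dolly wheel and the Zoom slider both change how large an object appears, but by different means: dolly changes the camera distance, while zoom changes the field of view. The following is a minimal sketch of that difference, assuming a simple pinhole perspective camera. The function name and the numbers are illustrative assumptions, not part of MILT.

```python
import math

def apparent_fraction(object_height, distance, fov_degrees):
    """Fraction of the vertical view filled by an object, under a simple
    pinhole perspective model (an assumption, not MILT's actual renderer)."""
    visible_height = 2.0 * distance * math.tan(math.radians(fov_degrees) / 2.0)
    return object_height / visible_height

# Hypothetical starting view: object 10 units away, 45 degree field of view.
base = apparent_fraction(1.0, distance=10.0, fov_degrees=45.0)

# Dolly: move the camera in to half the distance; the field of view is unchanged.
dollied = apparent_fraction(1.0, distance=5.0, fov_degrees=45.0)

# Zoom: keep the camera where it is and narrow the field of view instead.
zoomed = apparent_fraction(1.0, distance=10.0, fov_degrees=22.5)
```

In this model, halving the distance exactly doubles the apparent size, whereas narrowing the field of view enlarges the object without moving the camera.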


MOUSE  &  COMMON  CONTROLS 


The  3D  authoring  environment  uses  a  ‘virtual  trackball’  to  view  the  defined  scene. 


Control                                      Action
-------------------------------------------  -----------------------------------
Left Mouse Button *                          Spin (rotate the virtual trackball)

Middle Mouse Button or                       Pan (translate up/down/left/right)
<CTRL> + Left Mouse Button *

<CTRL> + Middle Mouse Button or              Dolly in and out
Left Mouse Button + Middle Mouse Button *

Right Mouse Button                           Display the Viewer Popup Menu

<s> + Click *                                Seek Mode

<ALT>                                        Switch temporarily to Viewing Mode

* View Mode Only (hand cursor)


Seek  Mode  -  After  pressing  (and  releasing)  the  <s>  key,  the  cursor  changes  to  a  "target"  shape. 
Click  on  the  desired  seek  point  with  the  left  mouse  button.  After  clicking  on  the  seek  point,  the 
camera  will  be  automatically  moved  to  view  the  seek  point. 

Displaying  the  Viewer  Popup  Menu  -  The  popup  menu  items  allow  you  to  change  most  of  the 
viewer  properties,  such  as  drawing  mode  ("as  is",  "wireframe",  etc.).  There  are  also  popup  menu 
items  corresponding  to  most  of  the  viewer  decoration  buttons  (Home,  Set  Home,  View  All,  etc.). 

Switching  temporarily  to  Viewing  Mode  -  When  the  viewer  is  in  Selection  mode,  pressing  and 
holding  the  ALT  key  temporarily  switches  the  viewer  to  Viewing  mode.  When  the  ALT  key  is 
released,  the  viewer  returns  to  Selection  mode.  Note:  If  any  of  the  mouse  buttons  are  currently 
depressed,  the  ALT  key  has  no  effect. 
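
The temporary ALT switch described above behaves like a small state machine. The sketch below is only an illustration of that rule. The class and attribute names are hypothetical, and the handling of an ALT release while a mouse button is held is an assumption, since the text does not cover that case.

```python
from dataclasses import dataclass

@dataclass
class ViewerMode:
    mode: str = "selection"          # "selection" or "viewing"
    alt_override: bool = False       # True while ALT has temporarily switched modes
    mouse_button_down: bool = False  # True while any mouse button is held

    def alt_pressed(self):
        # ALT only acts in Selection mode, and has no effect while a button is held.
        if self.mode == "selection" and not self.mouse_button_down:
            self.mode = "viewing"
            self.alt_override = True

    def alt_released(self):
        # Assumption: releasing ALT always ends a temporary switch.
        if self.alt_override:
            self.mode = "selection"
            self.alt_override = False
```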

Adding  New  Objects 


Objects  are  items  that  the  author  can  place  in  a  microworld  scene  and  that  students  can 
manipulate  during  the  exercise. 

Authors  can  add  new  objects  to  the  scene  by: 

1)  Displaying  the  Object  Palette  (Figure  6)  by  selecting  'Palette'  from  the  'Object'  menu. 

2)  Choosing  an  object  to  display  from  the  list  (Either  double-click  on  the  item  or  highlight  it  and 
click  on  the  'Apply'  button) 

3)  Selecting  the  select  tool  (the  arrow) 

4)  Clicking  in  the  displayed  view. 


MILT  will  place  the  chosen  object  in  the  position  where  the  mouse  was  clicked. 


Collision detection was implemented so that authors cannot place an object so that it passes
through an existing object. Authors can still place objects on top of other objects.
New objects can be added to the outside "world" or to any scene.


[Object Palette screenshot. Visible entries: Cassette Recorder, Cassette Tape, BackPack, Bag,
Balled Up Paper, Book, Bookcase, Box, Briefcase, Bureau, Chair, Closet, Coat, Coffee Cup,
Compass, Desk, Door, Envelope, Fax Machine, Fence, File Cabinet, File Folder, Gate, Hat,
Id Card, Key, Lamp, Letter]

Figure 6 - Object Palette


The objects available are:

bag
balled up paper
book
bookcase
box
briefcase
bureau
cassette recorder
chair
closet
coat
coffee cup
desk w/ drawers
door
fax machine
file cabinet
file folders
gate
hall
hat
id card
key
lamp
letter
map
notebook
PA system
pants
paper
photo
plant
pop can
radio
secret compartment
secret door
shelf
shirt
shoes
table
television
tent
VCR
wall safe
wastebasket
white board
window

Table 1 - Object List


Changing  the  Object  Size 
To change the size of an object:

1. click on the select tool (the arrow) for one of the displayed views

2. click on an object (the selected object will have a box around it displayed)

3. click on one of the square anchor points (the square will become highlighted)

4. click and hold the mouse while dragging the anchor until the desired size is obtained

5. release the mouse

An object can be made at most twice as big as, or at most 80% smaller than, its default size.

Figure 7 - Changing Object Size


Moving  Objects  within  a  Scene 
To move an object within a scene:

1. click on the select tool (the arrow) for one of the displayed views

2. click on an object. A “selection” box will appear around it

3. click within the box. Arrows will appear indicating the plane of movement

4.  click  and  hold  the  mouse  on  the  object  to  move 

5.  drag  the  mouse  to  the  newly  desired  position 

6.  release  the  mouse 


Figure  8  -  Moving  Objects 


Rotating  Objects 

To  rotate  an  object  in  the  scene: 

1. Click on the select tool (the arrow)

2. Click on the object to rotate (the selected object will have a box around it
displayed)

3.  Click  on  one  of  the  lines  surrounding  the  object  (the  line  will  become  highlighted) 
Note:  the  line  selected  determines  the  rotation  direction. 

4.  Click  and  hold  the  mouse  while  dragging  the  cursor  until  the  desired  rotation  is 
obtained 

5.  Release  the  mouse 


Figure  9  -  Rotating  Objects 

Defining  Object  Attributes 

Attributes  are  authorable  components  of  objects.  For  example,  text,  color,  sign,  and  open 
are  attributes  of  a  book  object.  For  each  book  object  placed  in  the  microworld  the  author  is  able 
to  change  the  text  contained  in  the  book,  the  size  of  the  book,  the  color  of  the  book,  and  the  title 
of  the  book.  The  open  attribute  is  also  inherent  in  the  book  object  (it  can  be  opened  or  closed). 

The  authorable  attributes  available  in  Version  2.0  are: 

BMP - the ability to display various sorts of bitmaps

Color - the ability to change object color

Combination - the ability to be opened with a 2 digit combination

Connect - the ability to link to another room when passed through

Container - the ability to put things into the object

Hang - can be placed on a wall

Light - the ability to change the visibility of a room when turned on.

Lock - the object has the ability to be locked.

Open - the ability to open as in a door, drawer, envelope, box, etc.

Random - it performs an action on its own at a random time after entry into room.

Rotation - the object can be rotated in 3 dimensional space

Sign - the ability to create text that appears on the object.

Size - the ability to be different sizes.

Text - the ability to author new text for that object.

Unlock - the object has the ability to unlock specifically assigned locked objects

Video - the ability to play defined digitized video file

WAV - the ability to play WAV files.

Authors  can  specify  attributes  for  any  object  in  the  scene  by  double-clicking  on  the 
object.  Each  object  has  a  unique  set  of  authorable  attributes. 

Figure  10  shows  what  the  object  attributes  window  looks  like.  Only  applicable  attributes 
are  shown  for  each  object. 

Figure 10 - Object Attribute Window

The following table lists each attribute, the objects that have it, how the author can change
it, and any constraints on defining the attribute.

BMP
    Objects w/ Attribute: fax machine, id card, map, photo, picture
    Authoring: Enter name of bitmap file into the graphic filename field
    Constraints: graphic must be bitmap (.bmp)

Color
    Objects w/ Attribute: All
    Authoring: Select color from palette
    Constraints: 10 colors

Combination
    Objects w/ Attribute: wall safe
    Authoring: Enter a 2 digit number in the ‘Turn Right to Number’ field and a 2 digit
        number in the ‘Turn Left to Number:’ field.
    Constraints: Numbers must be between 00 and 10

Connect
    Objects w/ Attribute: door, window, secret panel
    Authoring: Click on “New Scene” button; pick scene type (room, outside, etc.)
    Constraints: Can only be placed from within scene

Container
    Objects w/ Attribute: backpack, bag, bookcase, box, briefcase, cassette recorder, closet,
        coat, coffee cup, desk, envelope, file cabinet, file folder, pants, pop can,
        secret panel 1, shirt, shoes, VCR, wall safe, wastebasket
    Authoring: Click on object representation of object to be placed inside, click in the
        ‘inside object’ view. MILT will place object where clicked. Double click on object
        displayed in ‘inside object’ view to edit.
    Constraints: Only objects that are smaller than container object can be placed inside it.
        Cassette recorder and VCR can only have appropriate tapes placed inside.

Hang
    Objects w/ Attribute: picture, PA system, shelf, white board, wall safe
    Authoring: Place any of these objects on a wall and they will stay
    Constraints: Any object besides those listed here will fall to floor when placed on a wall

Label
    Objects w/ Attribute: All
    Authoring: enter synonyms
    Constraints: none

Label Sound File
    Objects w/ Attribute: All
    Authoring: Enter the name of desired file into the Sound Filename field or record
        sound file
    Constraints: Must be WAV format. User computer must have sound card.

Light
    Objects w/ Attribute: lamp
    Authoring: Select either the On or OFF radio button to determine initial state
    Constraints: none

Lock
    Objects w/ Attribute: briefcase, door, file cabinet, wall safe
    Authoring: Select either Locked or Unlocked to set initial state
    Constraints: The wall safe is opened by a two number combination; all other objects are
        opened with a key.

Open
    Objects w/ Attribute: backpack, bag, balled up paper, book, box, briefcase, cassette
        recorder, closet, desk, door, envelope, file cabinet, file folder, gate, map,
        newspaper, notebook, secret panels, VCR, wall safe, wastebasket, window
    Authoring: Select either Open or Closed
    Constraints: None

Random
    Objects w/ Attribute: fax machine, PA system
    Authoring: Select ON to activate or OFF to deactivate
    Constraints: Due to random nature may or may not be triggered while student is in the room

Rotation
    Objects w/ Attribute: All
    Authoring: Select desired angle and dimension from Object Rotation field
    Constraints: Supported dimensions are horizontal and vertical. Supported rotation angles
        are 90°, 180°, and 270°. No intermediate rotation is allowed.

Sign
    Objects w/ Attribute: bag, book, bookcase, cassette tape, coffee cup, door, file cabinet,
        file folder, key, notebook, video tape
    Authoring: Enter text in Sign/Title text field
    Constraints: maximum of 64 characters

Size
    Objects w/ Attribute: All
    Authoring: Enter scale factor in percentage
    Constraints: All objects are scaled to be of the correct relative size. Smallest size 50%,
        largest 200%.

Text
    Objects w/ Attribute: balled up paper, book, envelope, fax machine, id card, letter,
        newspaper, notebook, paper, sticky note, white board
    Authoring: Click in the Text field and enter text using the keyboard
    Constraints: All objects except the book have 1 text field. The book has two text fields.

Unlock
    Objects w/ Attribute: key
    Authoring: Select 1 or more classes of objects that the key can unlock from the
        Unlock field
    Constraints: Keys will not be specific to any individual objects, but will be tied to
        classes of objects. For example, the blue key might unlock all doors, the red key all
        briefcases, and the gold key all locked objects.

Video
    Objects w/ Attribute: television, video tape
    Authoring: Enter the name of file into the Video Filename field
    Constraints: Must be avi format. User computer must support video formats.

Sound
    Objects w/ Attribute: cassette tape, PA system, radio
    Authoring: Enter the name of desired file into the Sound Filename field or record
        sound file
    Constraints: Must be WAV format. User computer must have sound card.

Table 2 - Objects and Attributes
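
Several of the constraints in Table 2 are simple enough to express as checks. The sketch below is illustrative only: the function and constant names are hypothetical, and testing a file's format by its filename extension is an assumption (the table states the required format, not how MILT verifies it).

```python
# Limits taken from Table 2.
MAX_SIGN_CHARS = 64
MIN_SCALE_PERCENT, MAX_SCALE_PERCENT = 50, 200
ROTATION_ANGLES = (90, 180, 270)
ROTATION_DIMENSIONS = ("horizontal", "vertical")

# Assumption: the required format is checked by filename extension.
REQUIRED_EXTENSION = {
    "BMP": ".bmp",
    "Video": ".avi",
    "Sound": ".wav",
    "Label Sound File": ".wav",
}

def valid_sign(text):
    """Sign/Title text may hold at most 64 characters."""
    return len(text) <= MAX_SIGN_CHARS

def valid_size(percent):
    """Scale factor must lie between 50% and 200%."""
    return MIN_SCALE_PERCENT <= percent <= MAX_SCALE_PERCENT

def valid_rotation(angle, dimension):
    """Only 90, 180, and 270 degree rotations, horizontal or vertical."""
    return angle in ROTATION_ANGLES and dimension in ROTATION_DIMENSIONS

def valid_media_file(attribute, filename):
    """Check the filename extension required for a media attribute."""
    return filename.lower().endswith(REQUIRED_EXTENSION[attribute])
```

For example, `valid_rotation(45, "vertical")` is false because intermediate rotations are not allowed.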

Defining  objects  that  are  inside  another  object 

Objects  that  can  have  other  objects  placed  inside  them  are  called  Container  Objects.  To 
place  objects  inside  another  object: 

1. Double-click on an object

2.  MILT  will  display  the  object  attribute  window 

3.  From  the  list  of  items  that  can  fit  in  the  container,  select  item(s)  to  place  inside  and  click 
arrow  to  move  object(s)  inside  container  item 

4.  Click  the  OK  button 

To  edit  the  attributes  of  any  object  that  is  within  the  container  double  click  on  the  object 
name  from  the  'Objects  Inside  Container'  list. 
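
The container constraints from Table 2 (an item must be smaller than its container, and the cassette recorder and VCR accept only appropriate tapes) can be sketched as a single check. Representing object size as one number, and pairing the cassette recorder with the cassette tape and the VCR with the video tape, are assumptions made for illustration. The names below are hypothetical.

```python
# Assumed pairing of tape-only containers with the tape each accepts.
TAPE_ONLY = {"cassette recorder": "cassette tape", "VCR": "video tape"}

def can_contain(container, container_size, item, item_size):
    """Return True if the item may be placed inside the container."""
    required_item = TAPE_ONLY.get(container)
    if required_item is not None and item != required_item:
        return False  # tape players accept only their own kind of tape
    return item_size < container_size  # item must be smaller than the container
```

Under these assumptions, `can_contain("briefcase", 10, "letter", 1)` is true, while `can_contain("VCR", 10, "letter", 1)` is false.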

Figure 11 - Example Container Attribute Window

When the author double clicks on a scene representation for a scene that does not yet
have any characteristics or objects defined, MILT will prompt the user to specify the type of
scene to be added. The screen looks similar to Figure 12.


Figure  12  -  Scene  Type  Declaration 

MILT automatically assigns a scene number to the new scene. The scene type defines the
dimensions inside the new scene. MILT 2.0 provides different scene types of predefined size for
the author to choose from. The different scenes range in size and shape. The author will place the
selected building type on the outside scene.

Editing/Defining  Scene  Characteristics 

Scene  attributes  are  name  and  color. 

To  edit  scene  attributes: 

1)  Click  on  the  Scene  with  the  Select  Tool  (the  arrow) 

2)  Select  'Attributes'  from  the  'Edit'  menu 

3)  Click  in  the  'Name'  field  and  enter  the  scene  name  and/or 

4)  Click  on  the  color  list  and  select  one  of  the  displayed  options 


Figure  13  -  Scene  Characteristics 

Editing  Scene  Background  Color 

The  background  color  for  any  scene  can  be  specified  by: 

1)  Selecting  'Edit  Background  Color'  from  the  'Scene'  menu 

2)  Selecting  a  color  from  the  color  wheel  or  slider  bar(s) 

Figure 14 - Scene Background Color


The  background  color  chosen  for  the  Outside  "world"  is  displayed  as  the  sky  and  space 
around  the  view.  Background  colors  for  internal  scenes  are  only  seen  as  the  space  outside  the 
scene  boundaries. 

Entering  a  Scene 

Once  a  scene  has  been  defined  in  the  Outside  "world"  authors  can  enter  the  scene.  Within 
any  scene  the  author  can  place  new  objects  and  define  object  attributes. 

There  are  two  methods  that  can  be  used  to  enter  a  scene. 

To  use  the  first  method: 

1)  Click  on  the  scene  representation  with  the  Select  tool  (the  arrow) 

2)  A  square  selection  box  will  appear  around  the  scene 


Figure  15  -  Select  Scene 


3)  Select  'Enter'  from  the  'Edit'  menu 

The  second  method  involves  selecting  the  scene  name  to  enter  from  the  'Scene'  menu 

Editing Scene Boundary Attributes

Boundaries are the “edges” of each scene. The dimensions of the boundaries are
determined by the scene type (room, outside, corridor, etc.) and cannot be changed by the author.

Authors  can  change  the  “look”  of  the  boundary  by  applying  a  background  texture  from 
the  MILT  library  or  applying  a  custom  texture  to  the  wall. 

To  define  the  boundary  attribute: 

1)  From  within  any  scene,  double  click  on  a  boundary  from  the  Edit  Scene  Window. 

2)  MILT  will  display  the  boundary  attribute  window.  (Figure  16) 

First,  choose  which  boundary  to  edit  by  selecting  the  name  from  the  'Boundary'  list. 

Then,  the  texture,  color,  and/or  texture  scale  of  that  boundary  can  be  changed. 

Change  the  texture  by  clicking  on  the  'texture'  button  and  selecting  from  the  list  (brick, 
wood,  light  carpet,  cobblestone,  sky,  grass,  tile,  stucco). 

Change the color of the selected boundary by moving the Red, Green and/or Blue slider
bars.

Change the texture scale by clicking on the appropriate horizontal and/or vertical
selection buttons. The texture scale refers to the number of horizontal and vertical tiles of the
texture that are applied to the boundary. MILT comes with reasonable default settings for each
texture. Note: if a texture is not selected this option will not work.

Light  intensity  slider  bars  allow  the  author  to  set  the  room  brightness. 

Figure 16 - Scene Boundary Attributes


Connect/Transition Attribute


Certain  objects  (doors,  windows,  and  secret  panels)  allow  the  student  to  transition  from 
one  scene  to  another  by  going  through  the  object. 

To  link  scenes  together,  the  author  must 

1)  Define  the  scenes  in  the  Outside  "world"  (See  Defining  A  New  Scene) 

2)  Enter  one  of  the  scenes  (See  Entering  a  Scene) 

3)  Add  a  transition  object  to  the  scene  (See  Adding  New  Objects) 

4)  Click  on  the  object  with  the  Select  Tool  (the  arrow) 

5) A selection box will be displayed around the object


Figure  17  -  Selection  Box 


6)  Move  the  object  onto  one  of  the  boundary  walls 

7)  Select  'Attributes'  from  the  'Edit'  menu 

8)  Specify  the  scene  the  object  connects  to  from  the  'Connected  to  Scene'  list  box 

9)  Click  on 'OK' 
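
The result of the linking steps above can be pictured as a table of links from transition objects to scenes. The sketch below is a hypothetical data structure, not MILT's internal representation. The `World` class, its method names, and the scene names are all assumptions.

```python
from collections import defaultdict

class World:
    """Stores, for each scene, which scene each transition object leads to."""

    def __init__(self):
        # scene name -> {transition object name -> connected scene name}
        self.links = defaultdict(dict)

    def connect(self, scene, transition_object, target_scene):
        """Record the 'Connected to Scene' choice for one transition object."""
        self.links[scene][transition_object] = target_scene

    def go_through(self, scene, transition_object):
        """Return the scene reached through the object, or the current
        scene if the object has not been connected to anything."""
        return self.links[scene].get(transition_object, scene)

world = World()
world.connect("office", "door", "hallway")  # hypothetical scene names
```

In this sketch each link is one-way. A door that leads back would be a second transition object placed in the other scene and connected in the same manner.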

Setting  Texture  Quality 

The texture quality affects the speed at which the microworld exercise runs. The higher
the texture quality, the slower the speed.

To set the texture quality:

1) Select 'Texture Quality' from the 'Options' menu

2) Enter a value between 0.0 and 1.0 (recommended value is 0.5)

3)  Click  on  the  'OK'  button 

Figure 18 - Texture Quality


Using  Viewpoints 

Viewpoints are essentially anchor points that authors may want to return to when defining
a scene. Setting a viewpoint stores the current camera position under a name.

To  set  a  viewpoint: 

1)  Move  around  the  scene  using  the  viewer  controls  until  the  camera  is  positioned  where 
you  want  it 

2)  Select  'Create  Viewpoint'  from  the  'Viewpoint'  menu 

3)  Enter  a  name  for  the  viewpoint  that  describes  the  current  position 

4)  Click  the  'OK'  button 


Figure  19  -  New  Viewpoint 


Moving  to  a  Viewpoint: 

1)  Click  and  hold  the  mouse  on  the  'Viewpoint'  menu 

2)  Move  the  mouse  down  until  the  desired  Viewpoint  name  is  highlighted 

3)  Move  the  mouse  to  the  left  until  'Seek'  is  highlighted 

4)  Release  the  mouse 

The  view  will  change  to  the  stored  viewpoint. 

Figure 20 - Viewpoint Menu


Removing a Viewpoint:

1)  Click  and  hold  the  mouse  on  the  'Viewpoint'  menu 

2)  Move  the  mouse  down  until  the  desired  Viewpoint  name  is  highlighted 

3)  Move  the  mouse  to  the  left  until  'Remove'  is  highlighted 

4) Release the mouse
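
The three viewpoint operations (create, seek, remove) amount to a store of named camera positions. The following sketch illustrates that idea. The class and method names are hypothetical, and a camera position is reduced to a plain coordinate tuple for simplicity.

```python
class Viewpoints:
    """A store of named camera positions."""

    def __init__(self):
        self._stored = {}

    def create(self, name, camera_position):
        """Store the current camera position under a descriptive name."""
        self._stored[name] = tuple(camera_position)

    def seek(self, name):
        """Return the stored camera position for the named viewpoint."""
        return self._stored[name]

    def remove(self, name):
        """Delete the named viewpoint."""
        del self._stored[name]

    def names(self):
        """List viewpoint names in the order they were created."""
        return list(self._stored)
```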

Test  Scene/View  Exercise 

Once  a  microworld  scene  has  been  defined,  the  author  can  test  it  from  author  mode.  This 
will  allow  authors  to  view  the  newly  created  scene  from  the  perspective  of  the  student. 

In  the  student  perspective,  users  issue  commands  to  move  around  the  microworld  and 
manipulate  objects.  The  goal  is  to  find  the  answer  to  a  question.  Students  will  see  a  window 
with  a  3D  microworld  display  area  and  command  entry  area.  To  test  the  exercise,  issue 
commands  to  search  the  scene  or  scenes  for  clues  to  answer  the  question. 

There  are  two  possible  modes  of  input  for  the  microworld.  In  one  mode,  you  can  issue 
commands  by  typing  into  the  text  field.  In  the  second  mode,  you  can  issue  commands  by 
speaking  into  a  microphone.  Switch  between  modes  by  clicking  on  the  Talk  checkbox. 

There are only certain actions that are recognized in the microworld. If the command
issued to the microworld is recognized, you will see an action occur in the microworld display.

The  Microworld  Exercise  from  the  Student  Perspective 

Overview 

In  the  microworld  exercise,  the  student  sees  a  window  with  a  3D  microworld  display  area 
and command entry area similar to Figure 21. In this exercise, the student issues commands to
search  the  scene  or  scenes  for  clues  to  the  answer  to  the  question.  There  are  two  modes  of  the 
microworld.  In  one  mode,  the  student  issues  commands  by  typing  into  a  text  field.  In  the  second 
mode,  the  student  issues  commands  by  speaking  into  a  microphone.  When  the  student  feels  that 
he  has  the  correct  answer  to  the  question,  he  will  select  ‘Check  Answer’  from  the  Exercise  menu 
and  type  the  suspected  answer  into  the  space  provided.  To  review  the  question,  the  student 
selects  ‘View  Question’  from  the  Exercise  menu. 




Figure  21  -  Microworld  Exercise 


Actions 

The  microworld  exercise  will  only  allow  a  predefined  and  limited  set  of  actions  to  occur. 

The  actions  that  are  available  in  Version  2.0  are: 

carry  the _ 

carry  the _ to  the _ 

climb  the _ 

climb  on  the _ 

climb  off  the _ 

climb  over  the _ 

climb  under  the _ 

climb  through  the _ 

close  the _ 

cover the _ with the _

crawl through the _

crawl  under  the _ 

drop  the _ 

eat  the _ 

enter  the _ 

get  down  from  the _ 

get  off  the _ 

get  the _ 

get  the _ from  the _ 

get  the _ from  inventory 

get  up 

give  the _ to  the _ 

go  back 

go  in  the _ 

go  left 
go  East 
go  North 
go  South 
go  West 
go  right 

go  through  the _ 

go  to  the _ 

hang  the  picture  on  the  wall 

hear  the _ 

hit  the _ 

hold  the _ 

insert  the _ into  the _ 

insert  the _ 

jump  over  the _ 

jump  down 
jump  off  the _ 


jump  on  the _ 

jump  to  the _ 

kick  the _ 

knock  on  the _ 

leap  over  the _ 

leap  onto  the _ 

leave  the _ on  the _ 

leave  the _ 

left to (followed by 2 digits)

lift  the _ 

lift  up  the _ 

listen  to  the _ 

load  the  audio  cassette 
load  the  video  cassette 

lock  the _ 

look  at  the _ 

look  behind  the _ 

look  down 
look  in  inventory 
look  up 
look  to  the  left 
look  to  the  right 

look  in  the _ 

look  in  your _ 

look  under  the _ 

look  North 
look  South 
look  West 
look  East 

move  all  but  the _ 

move  forward 
move left
move  right 

move  the _ 

move  the _ to  the _ 

move  the _ under  the _ 

move the _ from the _ to the _

move  the _ behind  the _ 

open  the _ 

pause  the _ 

pay  the  man 

pick  up  the _ 

place  the _ on  the _ 

place  the _ under  the _ 

place  the _ behind  the _ 

play  the _ 

put  down  the _ 

put  the _ back 

put  the _ in  inventory 

put  the  cassette  in  the  cassette  player 
put  the  cassette  in  the  VCR 

put  the _ down 

put  the _ on  the _ 

put  the _ under  the _ 

put  the _ in  the _ 

read  the _ 

remove  the _ from  inventory 

retrieve  the _ 

right  to  (followed  by  2  digits) 

run  to  the _ 

search  for _ 

search the _

see the _

set  the _ on  the _ 

set  the _ under  the _ 

show  the _ 

sit 

sit  on  the _ 

sit  under  the _ 

stand 

stand  on  the _ 

start  the _ 

steal  the _ 

stop  the _ 

take  the _ 

take  the _ out  of _ 

take  the _ from  inventory 

take  the  picture  down 

taste  the _ 

the _ goes  into  the _ 

throw  the _ 

touch  the _ 

turn  around 
turn  left 
turn  right 

turn  off  the _ 

turn  on  the _ 

turn  the  key 

unfold  the _ 

unroll  the _ 

unlock  the _ 

walk  forward 
walk left
walk  right 
walk  North 
walk  South 
walk  East 
walk  West 
walk  straight  ahead 

walk  to  the _ 

walk  to  the  end  of  the  hall 
walk  through  the _ 

If  the  command  issued  to  the  microworld  is  recognized  the  student  will  see  an  action 
occur  in  the  microworld  display. 
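
The action list above can be read as a set of templates in which each blank stands for an object name. The sketch below shows one simple way to match typed text against such templates with regular expressions. It is not how MILT recognizes commands (MILT relies on natural language processing and speech recognition software); the function names and the short template list are assumptions chosen for illustration.

```python
import re

# A small, illustrative subset of the action list; "_" marks a blank.
TEMPLATES = [
    "carry the _ to the _",
    "carry the _",
    "open the _",
    "look in inventory",
    "walk to the _",
]

def compile_template(template):
    """Turn a template into a regex in which each blank captures a phrase."""
    parts = [r"(.+?)" if word == "_" else re.escape(word) for word in template.split()]
    return re.compile(r"\s+".join(parts), re.IGNORECASE)

# Try templates with more words first, so that "carry the _ to the _"
# is preferred over the shorter "carry the _".
COMPILED = sorted(
    ((template, compile_template(template)) for template in TEMPLATES),
    key=lambda pair: len(pair[0].split()),
    reverse=True,
)

def recognize(command):
    """Return (template, [blank values]) for a recognized command, else None."""
    text = command.strip().rstrip(".")
    for template, pattern in COMPILED:
        match = pattern.fullmatch(text)
        if match:
            return template, list(match.groups())
    return None
```

Matching longer templates first matters because a blank can capture several words: without that ordering, "carry the book to the table" would match "carry the _" with the blank filled by "book to the table".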

“Look  and  Feel” 


THE  FIRST  PERSON  VIEW 

The  first  person  view  and  3D  nature  of  the  microworld  enables  the  view  to  change  as  the 
student issues commands. The student's view is essentially like a camera moving around 3D
space.  The  camera  can  zoom  in,  zoom  out,  rotate  horizontally,  and  rotate  vertically.  For 
example,  if  a  student  issues  the  command  “walk  to  the  table”  the  student  sees  the  table  become 
closer  and  closer  in  the  view  until  the  table  is  reached. 

REACHING 

If  the  student  reaches  for  an  object  that  will  go  into  inventory  the  object  will  get  closer 
and  closer  in  the  view  then  disappear. 

If the student picks up an object and carries it to another location, the object will get
closer and closer in the view, the camera will travel to the “go to location”, and the object will
move farther from the view.

Example: if a book is on the chair and the student issues the command “carry the book to
the table”, the camera will move to the chair, the book will move to the camera (get bigger),
then the book will disappear from view, the camera will move to the table, the book will appear
close, then the book will move onto the table (getting smaller).

Body  parts  (arms,  hands,  feet,  etc.)  of  the  1st  person  entity  (the  student)  will  not  appear. 

Moving Between Scenes

When  a  student  goes  into  a  new  scene  and  turns  around,  he  will  see  a  boundary  scene 
with  the  correct  number  and  type  of  transition  objects  (windows,  doors,  etc)  connecting  to  the 
new  scene  displayed. 

Labels 

Students will have the ability to gain information about an object by looking at that
object's “label”. The label will display the name of the object and give the student
information  about  what  actions  can  be  performed  on  it.  Labels  will  be  displayed  when 
the  user  selects  an  object  with  the  mouse.  The  label  will  look  similar  to  Figure  22.  The 
text  on  the  1st  screen  will  appear  in  English.  If  the  exercise  was  constructed  in  a  foreign 
language,  the  student  will  have  the  option  of  also  viewing  the  label  information  displayed 
in  that  language  by  clicking  on  the  “Foreign  Language”  button.  The  student  will  also  be 
able  to  listen  to  a  WAV  recording  that  was  made  by  the  instructor.  The  Foreign 
Language  display  will  be  created  from  the  synonym  lists  generated  by  the  author. 


Figure  22  -  Label  Window 




Object  Inventory 

The  ability  to  inventory  objects  is  required  in  order  to  allow  the  student  to  carry  multiple 
objects  from  place  to  place  and  for  the  student  to  pick  up  multiple  items.  The  inventory 
functionality  was  implemented  using  a  combination  of  automatic  text  recognition  and  the 
“inventory”  button  on  the  screen. 

PLACING  OBJECTS  INTO  INVENTORY 

MILT  automatically  places  objects  in  inventory  whenever  the  student  issues  one  of  the 
following  commands: 

carry  the _ 

hold  the _ 

get  the _ 

pick  up  the _ 

lift  the _ 

lift  up  the _ 

retrieve  the _ 

steal  the _ 

take  the _ 

get  the _ from  the _ 

get  the _ out  of  the _ 

pick  up  the _ from  the _ 

pick up the _ out of the _

lift  the _ from  the _ 

lift  the _ out  of  the _ 

lift  up  the _ from  the _ 

lift  up  the _ out  of  the _ 

retrieve  the _ from  the _ 

retrieve  the _ out  of  the _ 

steal  the _ from  the _ 

steal  the _ out  of  the _ 

take  the _ from  the _ 

take the _ out of the _

put the _ in/into inventory

place  the _ in/into  inventory 

move  the _ to/into  inventory 

VIEWING  OBJECTS  IN  INVENTORY 

The  collection  of  items  in  inventory  can  be  viewed  by  the  student  either  through  a 
command  or  a  button  push.  Once  either  method  is  invoked,  MILT  will  display  a  window  that 
shows  a  list  of  all  objects  currently  in  inventory. 

The commands used to view inventory items are:
look  in  inventory 
check  the  inventory 
view  inventory 

The  student  can  also  view  inventory  contents  by  clicking  on  the  ‘Inventory’  button. 

REMOVING  OBJECTS  FROM  INVENTORY 

Items can be removed from inventory in two ways. The first method is through the use of
commands. The following commands will remove items from inventory:

Get  the _ from  inventory 

Remove  the  _  from  inventory 

Take  the _ from  inventory 

Retrieve  the _ from  inventory

Move  the _ from  inventory 

Put  down  the _ 

Put  the _ down 

The  second  method  that  can  be  used  to  remove  items  from  inventory  is  by  clicking  on  the 
inventory  button  and  selecting  the  item  from  the  inventory  list. 

Once  the  student  has  selected  an  item  to  remove,  MILT  places  that  item  on  the  ground  in 
the  current  scene. 

The  student  must  remove  an  item  from  inventory  before  he  can  use  it.  For  example,  if  a 
map  is  in  the  inventory  and  the  student  wants  to  look  at  it,  he  will  need  to  remove  it  from 
inventory. 

Attribute Display

Various  objects  in  the  microworld  have  different  attributes  as  shown  in  Table  2.  The 
student  triggers  various  attributes  to  be  changed  or  displayed  by  issuing  appropriate  commands. 

TEXT  DISPLAY 

Objects  that  have  the  authorable  text  attribute  are:  balled  up  paper,  book,  envelope,  fax 
machine,  id  card,  letter,  newspaper,  notebook,  paper,  sticky  note,  and  white  board.  The  student 
can  access  the  text  by  issuing  one  of  the  following  commands: 

read  the  «object» 

look  at  the  «object» 

Once  the  command  has  been  issued  MILT  brings  up  a  text  display  window.  The 
background  for  the  authorable  text  will  be  dependent  on  the  object  and  will  be  a  realistic 
representation  of  the  object. 

The book and notebook can have a maximum of 12 pages displayed. All other objects
have only one text field. The student turns the pages by clicking on the corner with the upturned
page showing.


CONTAINED  ITEM  DISPLAY 

There  are  several  objects  that  can  contain  other  objects.  These  “container”  items  are: 
backpack,  bag,  bookcase,  box,  briefcase,  cassette  recorder,  closet,  coat,  coffee  cup,  desk, 
envelope,  file  cabinet,  file  folder,  pants,  pop  can,  secret  panel  1,  shirt,  shoes,  VCR,  wall  safe,  and 
wastebasket. 

The  student  is  able  to  see  the  contents  of  an  object  using  the  following  commands: 

open  the _ 

look  in  the _ 

MILT 2.0 displays the contents in the same manner as MILT 1.x: once the student enters
one of the above commands, MILT will “pop out” all of the objects that are contained inside.

VIDEO  DISPLAY 

The television and VCR can play video files. Video playback is triggered by
the following commands:

turn  on  the  (VCR,  television) 

play  the  VCR 

The video does not appear inside the object. It appears in a window separate from the
microworld scene.

Handling  Multiple  Objects  of  the  Same  Type 

Version 2.0 allows multiple objects of the same type to be used in a single scene and
allows the student to carry objects into another scene that may already have the same type of
object in it. This capability adds complexity to the microworld.

If more than one object of a given type exists in the scene, MILT uses the following rules,
in order of priority, to determine which object is being acted upon:

1. Determine the object using differentiating adjectives (red, blue, big, ...)

2. Use the object displayed in the current view

3. Use the last direct object named

4. Use the object closest in proximity
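
The priority ordering of these rules can be sketched as a chain of filters. This is only an illustration under stated assumptions: the `SceneObject` fields, the `resolve` function, and the choice to narrow the candidate set when a rule matches more than one object are hypothetical and are not taken from the MILT source.

```python
from dataclasses import dataclass, field

@dataclass
class SceneObject:
    ident: str                      # hypothetical unique identifier
    kind: str                       # object type, e.g. "book"
    adjectives: set = field(default_factory=set)
    in_view: bool = False           # displayed in the current view?
    distance: float = 0.0           # distance from the student

def resolve(candidates, adjectives=(), last_named=None):
    """Pick one object from a non-empty list of same-type candidates,
    applying the four rules in priority order."""
    pool = list(candidates)
    # Rule 1: differentiating adjectives.
    if adjectives:
        matches = [o for o in pool if set(adjectives) <= o.adjectives]
        if len(matches) == 1:
            return matches[0]
        if matches:
            pool = matches          # assumption: keep narrowing on ties
    # Rule 2: object displayed in the current view.
    visible = [o for o in pool if o.in_view]
    if len(visible) == 1:
        return visible[0]
    if visible:
        pool = visible
    # Rule 3: last direct object named.
    for obj in pool:
        if last_named is not None and obj.ident == last_named:
            return obj
    # Rule 4: object closest in proximity.
    return min(pool, key=lambda o: o.distance)
```

With a red book and a blue book both in view, "the red book" is settled by rule 1, whereas a bare "the book" falls through to rule 3 or rule 4.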

Continuous  Speech  Recognition 

In  the  speech  recognition  mode  the  student  utters  commands  into  the  microphone  instead 
of  typing  them  in  using  the  keyboard.  As  the  student  speaks,  MILT  displays  in  text  what  the 
recognizer  thinks  the  student  is  saying. 

Task  7:  Develop  Arabic  CSR  models 

The  speech  recognition  components  were  developed  using  the  HTK  application 
developed  by  Entropic  Cambridge  Research  Laboratory.  How  HTK  achieves  recognition  is  best 
understood  in  terms  of  the  structure  of  a  speech  recognizer  illustrated  in  Figure  23. 


Figure  23  -  Speech  Recognition  Components  (Ollason,  1997) 

The central element of a speech recognizer is the decoder. This transcribes continuous
speech input into a textual symbol sequence that an application can directly process. It requires a
syntax network to define the allowed word sequences appropriate for the recognition task, a
pronunciation dictionary specifying an acoustic model sequence for each word in the task
vocabulary, and a set of acoustic models which model the individual speech sounds.

The HTK system developed by Entropic includes a dictionary, decoder and a set of
pre-trained acoustic models for English and Spanish. However, Arabic modules did not exist.

John Morgan and Steve LaRocca were the principal developers at the U.S. Military
Academy (USMA) who created the phone-level acoustic hidden Markov models (HMMs)
for Arabic. This section describes the steps taken by personnel at the USMA to develop the
components for the Arabic speech recognizer. USMA followed the example given in Entropic's
documentation (Odell, 1997) to build the acoustic models for Arabic and a language model for
MILT's microworld commands. The steps were as follows:

Step 1. Data Preparation

1.1 Creating the Prompt Scripts

1.2 Creating the Dictionary

1.3 Recording the Data

1.4 Creating the Transcription Files

1.5 Coding the Data

Step 2. Creating Monophone HMMs

2.1 Creating Flat Start Monophones

2.2 Fixing the Silence Models

Step 3. Creating Tied-State Triphones

Step 4. Creating the Language Model

Step 5. Recognizer Evaluation


Each  of  the  steps  is  described  in  the  remainder  of  this  section. 

Step 1: Data Preparation

1.1 Creating the Prompt Scripts

One of the U.S. Military Academy Arabic instructors, Mr. Raja Chouairi, wrote two
prompt scripts. We used the first script to collect data from 58 native Arabic speakers. The
script contains 155 prompts. The lines in the script roughly correspond to lines spoken by actors
in a play. Two characters meet in a cafe, where they discuss topics such as family, food, religion
and relationships. The script has a total of 1152 words, 724 of which are distinct. We define a
"word" as a set of characters delimited by white space, so we consider "wa-man" ("and
who" in English) to be a single word. We used the second script to collect data from 25
nonnative Arabic speakers. This script contains only 40 short sentences and covers topics
similar to those of the first script. The nonnative script has a total of 150 words, 124 of which
are distinct. Mr. Chouairy wrote both scripts in Modern Standard Arabic (MSA). He originally
wrote the scripts in a 7-bit ASCII dialect of LaTeX called ArabTeX. This facilitated processing the
data with HTK. Later he rewrote the scripts as a WinCalis¹ script in Unicode. The WinCalis script
contains both a textual prompt in Arabic script and an auditory prompt.
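
The word definition used here (any run of characters delimited by white space) makes the total and distinct counts easy to reproduce. The sketch below is illustrative only; the function name `count_words` is hypothetical, and it assumes the prompts are supplied one per string with the item numbers already removed.

```python
def count_words(prompts):
    """Return (total, distinct) word counts for a list of prompt strings.

    A word is any maximal run of non-whitespace characters, so a form
    such as "wa-man" counts as a single word.
    """
    words = [word for line in prompts for word in line.split()]
    return len(words), len(set(words))
```

Applied to two prompts from the first script as they are printed in this report, `count_words(["mar.habaN mA Yismuka", "YanA min lubnAn YanA lubnAny"])` counts eight words in total, seven of them distinct, because "YanA" occurs twice.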

The scripts are shown below in ArabTeX format (see Appendix A for ArabTeX
conventions).

The  first  script  is  as  follows: 

1.  mar.habaN  mA  Yismuka

2.  YahlaN  Yismy  raj  A  wa-man  Yanti 

3.  YanA  rlmA  YanA  mi.sriyyaT  min  Yayna  Yanta

4.  YanA  min  lubnAn  YanA  lubnAny 

5.  mA_dA  ta'mal  yA  rajA 

6.  YanA  .tAlib  wa-Yanti  mA_dA  ta'malln 

7.  YanA  mudarrisaT  Yayna  taskun  yA  rajA 

8.  Yaskun  fy  ha_dihi  al-min.taqaT  bi-al-qurbi  mina  al-AgAmi'aT 

9.  baytuka  qaryb  mina  al-AgAmi'aT  walakin  manzily  ba'yd  'anhA 

10.  fy  al-.haqyqaT  1A  \'a'yAs  fy  bayt  wa  lakin  fy  AsiqaT 

11.  mAdA  tadrus  yA  rajA

12.  YanA  Yadrus  al-ttAry_h  wa-al-Agu.grAfiA 

13.  ha_dA  Agayyid  AgidaN  hal  tadrus  ka_tyraN 

14.  na'am  YanA  dAYimaN  maAs.gUl  mina  al-.s.sabA.h  Yil_A  al-masAY  wa-Yanti 

15.  YanA  Yudarris  al-.t.tib  fy  nafsi  al-AgAmi'aT  wa-YanA  maAs.gUlaT  Yay.daN 

16.  Yanti  .tabybaT  Yaby  huwa  .tabyb  wa-Yummy  hiya  'AlimaT  Yi'lAm 

17.  Yummy  hiya  rabbaT  bayt  ta'mal  kul  al-waqt 

18.  hiya  maAs.gUlaT  ka_tyraN  fy  al-bayt  ,hay_tu  ta'mal 

19.  kam  lu.g.gaT  tatakallam  bi-Yisti_tnAYi  al-lu.g.gaTi  al-'arabiyyaT 

20.  fy  al-.haqyqaT  _talA_taT  waYanti  kam  lu.g.gaTaN  tatakallamyn 

21.  Ya'rif '  AYilataka  min  al-.s.sUraT  al-AgamylaT 

22.  hal  ta'rifyn '  AYilaty  mina  al-rrisAlaTi  al-.t.tawylaT  Yam  al-qa.syraT 

23.  LA  walakin  Ya'rif  zawAga  _hAlatika  AgAlis  Yil_A  al-yamyn 

24.  wa-ha_dihi  hAlaT  wAlidy  AgAlisaT  Yil_A  al-yasAr 


1 WinCalis is a Windows-based Computer-Assisted Language Instruction System
developed by the Humanities Computing Facility, Duke University.

25.  wa-ha_dA  'am  wAlidatika  wAqif  warAYa  zawAgaT  Ya  hyka  al-wAqifaT  Yay.daN

26.  na'am  huwa  .dAbi.t  fy  al-AgayAs  mi_tla  Ya_hyh 

27.  mi_tl  Yibn  'ammy  wa-bint  hAly  wa-Ya_hyhA 

28.  kullu  YaqAriby  hum  fy  al-AgayAs  walakin  laysa  ma'ahum  fulUs 

29.  fy  Yay  AgayAs  ya_hdumUn 

30.  fy  al-AgayAs  al-lubnAny  li-Yanna  lubnAn  huwa  wa.tanuhum 

31.  wa-al-wa.tana  al-'araby  huwa  wa.tanunA 

32.  tab'aN  hal  'indaki  waqt  al-YAn 

33. 1A  laysa  'indy  waqt  al-YAn  limA  dA  tasYal 

34.  YasYal  li-Yannany  Yuryd  Yan  Yad'Uki  Yil_A  fmAgAn  qahwaT 

35.  saYaqbal  al-dda'waT  walakin  1A  YaAsrab  al-qahwaT 

36.  rubbamA  taY_hu_dyna  kaYasa  AsAy  wa-kubbAyaTa  mAYiN 

37.  saYaqbal  al-da'waT  walakin  Yintabih  YanA  mutazawwiAgaT  wa-ly  Ya.hsan  raAgul  fy  al-' Alam 

38.  wa-YanA  \'a'zab  wa-laysa  ly  YimraYaT  YanA  Ya_dk_A  Asa_h.s  fy  al-' Alam 

39.  YaAs'ur  Yannaka  Asa_h.suN  wa-.hyduN  wa-mina  al-llAzim  Yan  taAgida  linafsika  rafyqaTaN 
li-.hayAtik 

40.  rubbamA  yA  .gArsUn  qahwaT  wa-AsAy  min  fa.dlika 

41.  liman  al-AsAsAy  wa-liman  al-qahwaT  yA  YustA_d 

42.  al-qahwaT  ly  wa-al-AsAsAy  lahA  wa-ha_dA  al-ba_hAsyAs  laka 
43. 1A  taAsrab  al-AsAsAy  al-ssA_hin  yA  raAgA 

44.  na'am  walakin  'inda  al-fu.tUr  faqa.t  bayna  al-.gadAY  wa-al-'aAsAY  Yufa.d.dil  al-qahwaT 

45.  hal  tu.hib  al-AsAsAy  al-Yamryky  al-mu_tallaAg 

46.  tab'aN  Yu.hibbuhu  ka_tyraN  walakin  fy  fa.sli  al-.s.sayf  li-Yanna  al-.t.taqs  .hAr 

47.  wa-fy  fa.sl  al-AsAsitAY  'indamA  yakUnu  al-Agawwu  bAridaN  tufa.d.dil  al-maAsrUb  al- 
ssA_hin 

48.  bi-al-.t.tab'  wa-fy  al-rraby'  wa-al-_haryfYaAsrab  al-maAsrUbAt  al-ku.hUliyyaT  hA.saTaN 
ma'a  al-Yakl 

49.  YanA  1A  YaAsrabahA  li-Yannany  muslimaT  mA  huwa  naw'u  Yaklika  al-mufa.d.dal 

50.  ta' Amy  al-mufa.d.dal  huwa  mina  al-ma.tba_hi  al-llubnAny  wa-Yu.hibbu  al-mulU_hiyyaT 
ka_tyraN 

51.  wa-lakin  yA  raAgA  al-mulU_hiyyaT  huwa  .tabaquN  min  Ya.sliN  mi.sry 

52.  'aAgyb  Agaddaty  lam  taqul  ly  ha_dA  mun_du  .tufUlaty  wa-hiya  tu.ha.d.diruhA  ly 

53.  mA  qAlat  laka  ha_dA  rubbamA  li-VannahA  lubnAniyyaT  hal  ta'yAs  Agaddatuka  fy  bayt
Vahlik 

54.  YawalaN  Agaddaty  laysat  lubnAniyyaT  _tAnyaN  Yintaqalat  Yil_A  baytinA 

55.  ba'da  wafAT  Agaddy  al-ssanaT  al-mA.diyaT 

56.  al-ll_ah  yar.hamuhu  yabdU  Yanna  hiwAyataka  al-mufa.d.dalaT  hiya  al-Yakl 

57.  wa-li'bu  al-waraq  wa-tad_hynu  al-Yar.gylaT  wa-saharu  al-llayl  ma'  al-Ya.sdiqAY  sykAraT 

58.  AsukraN  1A  Yuda_h_hin  YammA  hiwAyAty  fa-hiya  al-mu.tAla'aT  wa-qirAYT  al-AgarydaT 

59.  wa-muAsAhadaT  al-ttalfiziyUn 

60.  YanA  1A  \'u.hibbuhu  walakin  Yufa.d.dil  muAsAhadaT  al-nnAs  wa-al-nna.zar  Yil_A  al- 
.t.taby'aT 

61.  mA_dA  tar_A  fy  al-.t.taby'aT  wa-al-nnAs  .gayr  al-.huzn  wa-al-maAsAkil 

62.  wa-mA_dA  taAgidyn  fy  al-.s.sa.hyfaT  .gayr  mubArayAt  al-rriyyA.daT 

63.  wa-Ya_hbAr  al-'  Alam  wa-al-brAmiAg  al-'ilmiyyaT  wa-al-Yiqti.sAdiyyaT 

64.  wa-al-musalsalAt  al-YiAgtimA'iyyaT 

65.  ya'ny  kul  AsayY  mumil  yaYty  Yil_A  al-' Alam  fy  YA_hir  al-llayl 

66.  wa-Yanta  tu.hib  kul  AsayY  sa.t.hy  yaAgyY  Yil_A  al-_d_dAkiraT  fy  Yawwal  al-yawm 

67.  wa-bidAyaT  al-nnahAr  lina_hruAg  min  ha  dA  al-maw.dU'  yA  rymA 

68.  yabdU  YannanA  1A  nattafiq  'alayhi 

69.  Yanta  'al_A  .sawAb  linatruk  ha_dA  al-maw.dU' 

70.  hal  baytuki  kabyr 

71.  na'am  yataYallaf  min  Yarba'  .guraf  nawm  .gurfaT  AgulUs  .gurfaT  .ta'Am 

72.  kam  .tAbiq  fy  baytiki  yA  .sadyqaty 

73.  yA  .sadyqy  .tAbiqayn  wa-hunAka  .hammAm  fy  kul  .tAbiq  wa-YamAma  al-bayt  .hadyqaT 
.sa.gyraT 

74.  laysa  hunAka  .hadyqaT  biAgAnibi  Asiqaty  wa-lakinnahA  mury.haT  wa-na.zyfaT 

75.  wa-bayty  laysa  wasi_haN  hal  ta_dhab  li-ziyAraT  Yahlika 

76.  'indamA  tasma.h  ly  al-fiir.saT  .gAlibaN  natabAdal  al-rrasAYil  wa-al-bi.tAqAt 

77.  _hilAla  mawsim  al-Ya'yAd  wa-fy  al-munAsabAt 

78.  na'am  hA.saTaN  'yd  al-mylAd  wa-'yd  al-fi.s.h  ,hay_tu  natabAdal  al-hadAyA 

79.  YanA  Yufa.d.dil  al-ssiyA.haT  'al_A  ziyAraT  al-Yahl 

80.  Yu.hibbu  Yan  YanAm  fy  al-funduq 

81.  YanA  1A  Yu.hib  al-fanAdiq  ka_tyraN  hiya  .gAliyaT  wa-laysat  ra  hy.saT 

82.  wa-lakinnahA  AgamylaT  wa-mury.haT

83.  Yu.hibbu  Van  Ya.s.hU  mina  al-nnawm  fy  sA'aT  mutaYa_h_hiraT 

84.  Ya'rif  ha_dA  YanA  Ya.s_hU  bAkiraN  kamA  qultu  laki  min  qabal  wa-Yu.sally  .sabA.haN 

85.  ma'  al-Yasaf  YanA  1A  Yu.hAfi.z  'al_A  mawA'yd  al-.s.salAT  wa-lA  'al_A  maw'id  al-.s.siyAm 

86.  fy  al-zzamAn  al-qadym  kAna  al-nnAs  ya_dhabUna  Yil_A  al-AgAmi'  wa-al-kanysaT 

87.  wa-mA  zAla  al-nnAs  fy  al-AsAsarq  yaf  alUna  hadA  fy  ba'.di  \'ayyAmi  al-YusbU' 

88.  wa-lakin  laysa  kamA  fy  al-mA.dy  faqad  Ya.sba.ha  al-nnAs  yahtammUn  bi-al-AsAsu.gal  kamA 
fy  al-.garb 

89.  ma'aka  .haq  al-nnisAY  kAnat  ta_dhab  lil-zzyAraT  YammA  al-\'An  fatuAsAhid  al-ttalfiziyUn 

90.  wa-tastamti'  bi-al-YistimA'  Yil_A  al-rAdyU  wa-al-mUsyq_A  wa-al-ssafar  wa-al-ssyA.haT 

91.  Yi_dA  kAna  al-zzawAgu  .ganiyyaN  YammA  Yi_dA  kAna  faqyraN 

92.  fa-al-baqAY  fy  al-bayt  yakUnu  al-qarAr  al-nnihAYiy 

93.  fy  raYy  ha  dA  huwa  sabab  al-faAsal  al-Yawwal  allady  yusabbib  al-.t.talAq  wa-nihAyaT  al- 
'AYilaT 

94.  'an  Yay  sabab  tatakallam  yA  raAgA 

95.  Yata.haddat  'an  al-\'ibti'Ad  'an  al-ll_ah  wa-al-ddyn  wa-al-_taqAfaT  fy  kul  makAn 

96.  Ya.zunnu  min  .hady_tika  Yannaka  Asa_h.suN  mu.hAfi.z 

97.  wa-mina  al-.garyb  Yannany  lam  Yaltaqy  bi-fatAT  wa-lam  Ya_h.tub  wa-mA  tazawwaAgt 

98.  kayfa  'arifita  wa-fahimta  mA  Yaq.sud  Yinnaka  daky  AgidaN  wa-lakinnaka  .gayr  wAqi'y 

99.  wa-lakinnany  sa'yd  fy  .hayAty  Yu.hibbu  al-masra.h  wa-al-maAsy  'al_A  AsA.t_AYi  al-ba.hr 

100.  hal  tata_dakkar  kitAba  al-kAtibi  al-AsAsahyr  alla  dy  .zahara  fy  maAgallaTi  YAms 

101.  al-maAgallaT  allaty  .sadarat  fy  al-madynaT  Yam  fy  al-qaryaT

102.  1A  al-maAgallaT  allaty  .tubi'at  fy  'A.simaT  al-wilAyaT 

103.  wa-allaty  tatakallam  '  an  al-'  alAqaT  wa-al-_hu.tUbaT

104.  hal  Yanti  fi'laN  mutazawwiAgaT  YanA  fy  al-.haqyqaT  .gayr  mutazawwiAg 

105.  Yintabih  ha_dA  huwa  al-.gArsUn  marraTaN  _tAniyaT  hal  turyd  AsayYaN  YA_har 

106.  na'am  min  fa.dlik  Yar.gab  fy  ba'.d  al-AsAsawrabAY  wa-al-ssala.taT  wa-al-lla.hmaT  wa-al- 
_hu.dAr 

107.  yabdU  Yannaka  AgU' An  YanA  AgU' AnaT  Yay.daN  min  fa.dlik  mil.h  wa-bhAr  wa-samak 

108.  wa-Yuryd  AsawkaT  wa-sikkyn  wa-mil'aqaT  na.zyfaT  ha_dihi  al-mil'aqaT  wasi  haT 

109.  .hA.dir  sayyidaty  kayfa  tu.hibbyna  al-ssamak

110.  maqly  wa-laysa  maAswy  wa-al-_hu.dAr  YuryduhA  maslUqaT 

111.  wa-nuryd  al- '  aAs  AY  bi-sur'  aT  VaAs '  ur  bi-al-AgU'  i  al-qawy

112.  wa-hal  tar.gabUna  fy  ba'  .di  al-.halw_A  ba'  da  al-waAgbaT 

113.  Vi  dA  sama.hta  ma'  al-ka_tyr  mina  al-ssukkar  wa-ba'  .d  al-.halyb  lil-qahwaT  AsukraN 

114.  VarAgU  \’an  1A  \'akUn  .talabtu  al-ka_tyr  mina  al-.t.ta' Am

115.  1A  tahtammy  fa-VanA  lastu  .ga.dbAnaN  VanA  sa'yd  li-Yannaki  al-YAn  zamylaty  fy  al-Yakl 

116.  hal  tasta\'aAgiri  al-AsAsiqaT  allaty  taskim  fyhA 

117.  na'  am  waYahly  yusA'  idUnany  fy  daf  ihi  fy  YAhiri  al-AsAsahr 

118.  YanA  Yamluk  bayty  walakin  yaAgib  Yan  Yadfa'  kul  sanaT  qis.taN  min  _tamanihi 

119.  Yi  dan  laysa  'indaki  al-.hurriyaT  al-kAmilaT 

120.  lakinnany  YaAstarik  ma'  a  zawAgy  fy  daf  i  qus.ti  al-manzil

121.  hal '  indaki  Yawl  Ad 

122.  1A  walakin  YanA  wa-zawAgy  nanwy  _dalika  fy  al-mustaqbal  al-qaryb  YinAsAY  al-ll_ah 

123.  al-.hamdu  li-Yallah  Yana  laysa  'indy  masYuwliyyaTa  al-Ya.tfAl  wa-al-madrasaT  wa-al- 
llibs 

124.  libsu  al-Ya.tfAl  Ya.gl_A  min  libasi  al-kibAr  wa-madArisuhum  .sa'baT 

125.  ha  dA  maw.dU'uN  AsAYi'  yadkuruhu  mu'.zami  al-YaAs_hA.s  fy  AgalsAtihim  wa-fy  al- 
nnAdy 

126.  na'am  al-YAn  \'ahtam  bi-AsirA\'  al-malAbis  ly 

127.  wa-YanA  Yu.hibbu  Yan  YaAstary  sayyAraT  .sa.gyraT  fa-darrAAgaty  lam  ta'ud  munAsibaT 

128.  hammy  al-Yawwal  al-YAn  huwa  al-.hu.sUl  'al_A  wa.zyfaT  fy  mustaAsf_A 

129.  mA  huwa  Yi_hti.sA.suki  yA  rymA 

130.  YanA  muta_ha.s.si.saT  fy  .tubbi  al-.t.tawAri_AY  wa-Yu.hibu  \'an  Ya'mal  fy  .gurfaTi  al- 
.t.tawAri_AY 

131.  yaAgib  Yan  Ya'Ud  Yil_A  Asiqqaty  alYAn  .gadaN  huwa  'ydu  al-mylAd 

132.  kul  'yd  wa-Yanta  bi-_hayr 

133.  wa-Yanti  bi-_hayr 

134.  yaAgib  \'allA  Yans_A  maktaba  al-baryd  Yay.daN  wa-al-YiAgtimA'  ma'a  YustA_dy  qabla 
al-'u.tlaT 

135.  saYadfa'  al-.hisAb 

136.  YabadaN  .gayr  mumkin  YanA  huwa  .sA.hib  al-dda'waT 

137.  walakin  yA  rafyqy  YanA  hiya  al-.t.tabybaT  wa-Yanta  huwa  al-ttilmy_d 

138.  wa-YanA  laysa '  indy  al-masYuwliyyaT 

139.  walakin  zawAgaki  wa-rafyqa  .taryqi  .hayAtiki 

140.  sAmi.hny  yA  raAgA  ka_dabtu  'alayk

141.  hal  hadA  .hulm  Yam  .haqyqaT  wa-Yanti  lasti  mutazawwiAgaT 

142.  1A  YaqUlu  daYimaN  ha_da  li-YaktaAsif  Yin  kAna  al-raAgul  Ya_hlAquhu '  AliyaT 

143.  mA  ma'  n_A  kul  ha_dA 

144.  Ya.zunnu  \'anna  al-riAgAla  al-yawm  yab.ha_tUA  'ani  al-_d_dahab  wa-al-fi.daT  wa-al- 
mAl 

145.  wa-kay  fa  waAgadtiny

146.  AsAbuN  mutafawwiq  Agamyl  _daky  mumti'  qawy 

147.  hal  tubAli.gyn 

148.  'al_A  al-Yi.tlAq 

149.  hal  mina  al-mumkin  \'an  naltaqy  marraTaN  Yu_hr_A  fy  Yi.hd_A  al-YayyAm 

150.  \'in  ra.gibta  fy  _dalika  lan  takUn  ha_dihi  al-marraT  alA'a  hyraT  allaty  naltaqy  bihA 

151.  rubbamA  satata'  arrafyna '  al_A  Yahly  wa-Ya.sdiqAYiy 

152.  wa-'  al_A  al-'  AdAti  al-llubnAniyyaT

153.  na'am  hiya  AsabyhaT  ka_tyraN  bi-al-'AdAti  al-mi.sriyyaT 

154.  al-.diyAfaT  fy  lubnAn  maw.dU'  ma_tal 

155.  wa-lA  tansy  al-karam 

The  second  script  is: 

1.  mar.habaN  mA  \'ismuka 

2.  \'ahlaN  YismI  raAgA  wa-Yanti 

3.  \'anA  rlmA  Yanta  .zarlf 

4.  AsukraN  wa-Yanti  AgamIlaT 

5.  Ya'  rifu  ha  dA  al-Yamr  al-mnhim

6.  al-ttawA.du'  min  .sifAtiki  Yay.daN 

7.  al-.hamldaTi  na'am 

8.  ta'AlI  nazUr  al-ma.t'am  linaYkul 

9.  _haw_haN  wa-_hu.draTaN  la_dI_daTaN 

10.  wa-la.hmaN  sA_hinaN  wa-.hallbaN 

11.  hadihi  hadiyyaT  _tamInaT

12.  al-AsAsayY  al-_t_tamln  lilAsAsa_h.s  al-.hablb 

13.  dahabuN  wa-fi.d.daTuN  lAmi' aT

14.  Ya.zunnu  YannanI  Asa_h.suN  .hazln

15.  al-bUYsu  wa-al-fiqru  min  ma'Alim  .day'atl 
16. 1A  Yanta  Asa_h.suN  .gabl 

17.  wa-Yanti  qalllaTu  al-ttah_dlb 

18.  yabdU  YannanA  lan  nattafiq  na_hnu  al-Yi_tnayn 
19. 1A  AsayY  yaAgma'unA  .gayr  al-kalAm  al-fAri.g 

20.  wa-al-.t.ta'Am  al-lla_dl_d 

21.  wa-al-.sidfaTu  al-.garlbaT 

22.  wa-.garAbaTu  al-Ya.twAr 

23.  wa-al-kalAm  al-fAri.g 

24.  'ani  al-zzawAAg  wa-al-.t.talAq 

25.  Ya'rifu '  AYilataka 

26.  min  al-.s.sUraT  al-latl  fl  Agaybl 

27.  LA  min  .hadl  tika  'anhum 

28.  hal  Ya'AgabUki 

29.  yabdU  \'annahum  lu.tafAY 

30.  yA  zamllatl  mA  huwa  lawnuki  al-mufa.d.dal 

31.  rubbamA  al-Yazraq 

32.  hal  al-.t.ta'miyyaT  wa-al-falAfil  nafs  al-AsAsayY 

33.  na'am  al-_t_tAnI  YismuN  lubnAnl  li-.tabaqiN  mi.srl 

34.  fl  al-.haqlqaT  Yu.hibu  Yan  YazUra  mi.sr 

35.  wa-YanA  Yu.hibu  \'an  YazUra  lubnAn  al-'azlz 

36.  saYa_dhab  fl  al-_t_tAnI  min  tiAsrIn  al-Yawwal 

37.  fl  al-'Idi  al-_t_tAli_ti  wa-al-_talA_tIn 

38.  hal  Yinqa.ta'at  'alAqatuka  bi-.sadlqatika  layl  A 

39.  YabadaN  sanatazawwaAg  .gadaN 

40.  mabrUk  yA  .sadlql 

1.2 Creating the Dictionary

The dictionary is basically a file that contains a list of words and their pronunciations.
The pronunciations are given in terms of "phones", and to each phone there corresponds a
statistical acoustic model. Before USMA wrote the dictionary, they had to decide on a list of
phones to model. Another Arabic instructor at the Academy, LTC Terrence
Potter, produced a list of the most important phones in the Arabic language. The phones were
classified by their articulatory features, because these features are used later in the process of
clustering triphones.


Arabic Phone   Articulatory Features
------------   ---------------------
A              stressed mid central vowel
AA             stressed low front vowel
C              voiced pharyngeal fricative
D              velarized voiced alveolar stop
G              voiced velar fricative
H              voiceless pharyngeal fricative
I              stressed high front lax vowel
II             stressed high front tense vowel
Q              voiceless glottal stop
S              velarized voiceless alveolar fricative
T              velarized voiceless alveolar stop
TH             velarized voiced interdental fricative
U              stressed high back rounded lax vowel
UU             stressed high back rounded tense vowel
Z              voiced interdental fricative
a              unstressed mid central vowel
aa             unstressed low front vowel
b              bilabial voiced stop
d              voiced alveolar stop
dj             voiced alveolar affricate
e              upper mid front tense vowel
f              voiceless labiodental fricative
g              voiced velar stop
h              voiceless glottal fricative
i              unstressed high front lax vowel
ii             unstressed high front tense vowel
j              voiced palato-alveolar fricative
k              voiceless velar stop
l              voiced alveolar lateral
m              voiced bilabial nasal
n              voiced alveolar nasal
q              voiceless uvular stop
r              voiced alveolar flap
s              voiceless alveolar fricative
sh             voiceless palato-alveolar fricative
sil            silence
sp             short pause
t              voiceless alveolar stop
th             voiceless interdental fricative
u              unstressed high back rounded lax vowel
uu             unstressed high back rounded tense vowel
w              voiced bilabial approximant
x              voiceless velar fricative
y              voiced palatal approximant
z              voiced alveolar fricative

Table  3  -  Arabic  Phones  and  Features 

1.3 Recording the Data

USMA collected the speech data at four different sites. Native Arabic speakers who were
learning English as a second language at the San Antonio branch of the Defense Language
Institute donated their speech to the native corpus. Mr. Chouairy headed a team that collected
more data from members of a native Arabic-speaking community near Toronto, Canada. Army
linguists from Fort Bragg and the Marshall Center in Garmisch, Germany contributed their
speech to the corpus of nonnative speech. Pentium-class laptop computers running
Windows NT were used with a brand-name microphone. The sampling rate was set at 22050 Hz
and 16-bit audio was used. The WinCalis script presented a line of text in the Arabic script and
then played a .wav file of Mr. Chouairy's rendition of the sentence. When ready, the informant
pressed the Enter key and read the prompt. WinCalis then played back the recording so that the
informant could either move on to the next prompt if the recording was good or re-record the
prompt if it was bad. Some of the native informants read every sentence from the WinCalis
script, but most read either the first 90 sentences or the last 90 sentences. The nonnative
informants at least attempted to read all 40 of their prompts. USMA ended up with
approximately 5300 .wav files from the native informants and approximately 1200 files from the
nonnative informants. Of the native informants, 18 were women and 40 were men. Of the 22
nonnative informants, 8 were women and 14 were men.
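
With several thousand recordings gathered at different sites, a quick format check helps confirm that every file matches the stated recording settings. The following is a minimal sketch using Python's standard `wave` module; the function name `matches_corpus_format` and the constant names are hypothetical, and nothing in the report indicates that USMA ran such a check.

```python
import wave

EXPECTED_RATE_HZ = 22050     # sampling rate stated in the text
EXPECTED_SAMPLE_BYTES = 2    # 16-bit audio is 2 bytes per sample

def matches_corpus_format(wav_source):
    """Return True if a WAV file (path or binary file object) has the
    sampling rate and sample width described for the corpus."""
    with wave.open(wav_source, "rb") as wav:
        return (wav.getframerate() == EXPECTED_RATE_HZ
                and wav.getsampwidth() == EXPECTED_SAMPLE_BYTES)
```

Files that fail the check could be set aside for resampling or re-recording before training.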

1.4  Creating  the  Transcription  Files 

In order to compute statistics for the acoustic models, each recording or data file must
have a corresponding transcription file. These transcriptions are called label files and they come
in many formats. They can be sentence-level, word-level, syllable-level, or phone-level
transcriptions, and they can be time aligned or not. In this case the goal was to create simple
phone-level label files with no time alignment. USMA created the phone-level label files in two
steps. First, a Perl script converted the file containing the prompts, written with one prompt per
line, into a set of word-level files. The script looked at one line at a time, creating a new file for
the line and placing each word in the prompt on a separate line in the new file. For the native
prompt script this produced 155 files, each containing one word per line. Then the HTK label
editor tool HLEd was used to create phone-level files. Loosely speaking, HLEd uses the
dictionary to "look up" the pronunciation of each word in the word-level files, replaces the word
with its pronunciation, and places each phone on its own line. As an example, the first sentence
of the native prompt script and its phone-level label file are shown here:

mar.habaN  mA  Yismuka 

sil
m    mar.habaN
A
r
H
a
b
a
n
sp
m    mA
AA
sp
Q    Yismuka
I
s
m
u
k
a
sil
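
The dictionary lookup behind this example can be sketched in a few lines. This is not the HLEd tool itself: the function `phone_labels` and the three-entry `EXAMPLE_DICT` are hypothetical stand-ins, with pronunciations copied from the example above, and the sketch assumes `sil` at the sentence edges and `sp` between words as shown there.

```python
# Pronunciations copied from the example above; a real dictionary would
# cover every word in the prompt scripts.
EXAMPLE_DICT = {
    "mar.habaN": ["m", "A", "r", "H", "a", "b", "a", "n"],
    "mA": ["m", "AA"],
    "Yismuka": ["Q", "I", "s", "m", "u", "k", "a"],
}

def phone_labels(prompt, dictionary):
    """Expand a whitespace-delimited prompt into a phone-level label list."""
    labels = ["sil"]                      # leading silence
    for index, word in enumerate(prompt.split()):
        if index > 0:
            labels.append("sp")           # short pause between words
        labels.extend(dictionary[word])   # KeyError if a word is missing
    labels.append("sil")                  # trailing silence
    return labels
```

Writing each element of `phone_labels("mar.habaN mA Yismuka", EXAMPLE_DICT)` on its own line reproduces the phone column of the label file above.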


1.5  Coding  the  Data 

In speaking, a person encodes ideas, represented by symbols, into acoustic forms that are
called phones. In listening, a person recovers the symbols that represent ideas from the acoustic
data. HTK uses digital signal processing (DSP) algorithms to model this encoding-decoding
process. USMA followed the HTKBook recommendation of encoding the data as Mel
Frequency Cepstral Coefficients (MFCCs) with energy, delta-energy and acceleration parameters.
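
Delta parameters of this kind are commonly computed by a regression over neighboring frames, d_t = Σ θ(c_{t+θ} − c_{t−θ}) / (2 Σ θ²) with θ running from 1 to the window size, and acceleration parameters are obtained by applying the same formula to the deltas. The sketch below illustrates that formula in plain Python. It is not the HTK code; the function name `deltas`, the window size of 2, and the replication of the first and last frames at the edges are assumptions, since the report does not give the configuration USMA used.

```python
def deltas(frames, window=2):
    """Regression deltas for a list of equal-length coefficient vectors.

    Frames before the start or after the end are replaced by the first or
    last frame respectively (an assumption of this sketch).
    """
    denom = 2 * sum(theta * theta for theta in range(1, window + 1))
    last = len(frames) - 1
    result = []
    for t in range(len(frames)):
        row = []
        for k in range(len(frames[t])):
            num = 0.0
            for theta in range(1, window + 1):
                ahead = frames[min(t + theta, last)][k]
                behind = frames[max(t - theta, 0)][k]
                num += theta * (ahead - behind)
            row.append(num / denom)
        result.append(row)
    return result
```

Calling `deltas(deltas(frames))` gives the acceleration parameters under the same assumptions.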


Step  2.0  Creating  Monophone  HMMs 

USMA again closely followed the HTKBook's recommendations in designing the
hidden Markov models (HMMs). Models with five states (only three of them emitting) and
single-mixture Gaussian distributions were used.
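
The five-state topology can be pictured as a left-to-right transition matrix in which the first and last states are non-emitting entry and exit states. The following sketch builds such a matrix; the function name `left_to_right_transitions` and the initial self-loop probability of 0.6 are assumptions made for illustration, and training would re-estimate these values.

```python
def left_to_right_transitions(num_states=5, self_loop=0.6):
    """Build a transition matrix for a left-to-right HMM whose first and
    last states are non-emitting entry and exit states."""
    matrix = [[0.0] * num_states for _ in range(num_states)]
    matrix[0][1] = 1.0  # entry state always moves to the first emitting state
    for state in range(1, num_states - 1):
        matrix[state][state] = self_loop            # stay in the same state
        matrix[state][state + 1] = 1.0 - self_loop  # advance to the next state
    # The exit state (last row) has no outgoing transitions.
    return matrix
```

Only the three middle states have self-loops, which corresponds to the "only three emitting" description in the text.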

2.1  Creating  Flat  Start  Monophones 

Since  USMA  was  modeling  phones  and  since  the  speech  data  consists  of  spoken 
sentences  which  can  be  very  long  sequences  of  phones,  a  major  task  for  the  algorithms  that 
calculate  the  statistics  for  the  models  was  to  find  the  endpoints  in  time  of  the  data  for  a  given 
model.  One  way  to  do  this  is  to  use  a  program  that  enables  a  person  to  view  and  hear  the  speech 
and  to  mark  by  hand  the  endpoints  of  the  phones.  USMA  did  not  accumulate  enough  of  this 
hand-labeled data, so they decided to follow the flat start training strategy. In this strategy,
statistics  are  calculated  over  all  of  the  data  and  these  statistics  are  assigned  to  all  of  the  models. 
After  this  step,  all  of  the  models  have  the  same  statistical  parameters.  Then  subsequent 
algorithms  attempt  to  automatically  time  align  the  data. 
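
The flat start idea reduces to computing one global mean and variance and copying them into every model. The sketch below shows this under simplifying assumptions: each model is represented by a single mean and variance vector rather than per-state parameters, and the names `flat_start`, `mean`, and `variance` are hypothetical. In HTK this kind of global initialization is, as far as we are aware, performed by the HCompV tool.

```python
def flat_start(frames, phone_names):
    """Give every phone model the same global mean and variance.

    `frames` is a non-empty list of equal-length feature vectors pooled
    over all of the training data.
    """
    dims = len(frames[0])
    count = len(frames)
    mean = [sum(f[k] for f in frames) / count for k in range(dims)]
    variance = [sum((f[k] - mean[k]) ** 2 for f in frames) / count
                for k in range(dims)]
    # Each model gets its own copy so later re-estimation can diverge.
    return {name: {"mean": list(mean), "variance": list(variance)}
            for name in phone_names}
```

After this step all models are identical, and the subsequent re-estimation passes are what pull them apart.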

2.2  Fixing  the  Silence  Models 

Small  abnormalities  usually  creep  into  speech.  For  example,  persons  sometimes  make 
false  starts  at  the  beginning  of  words  or  make  unintended  sounds.  To  make  the  speech 
recognizer  tolerant  of  these  abnormalities,  the  HTKBook  introduces  a  short  pause  model  and  a 
modification  to  the  silence  model.  To  train  the  short  pause  (sp)  model  USMA  had  to  go  back 
and  insert  the  sp  symbol  in  the  label  files. 

Step 3.0 Creating Tied State Triphones

The context of a phone, i.e. the phones that precede and follow it, influences its acoustic
realization. The goal is then to model phones in their contexts. Triphone models are an attempt
to achieve this goal. The HTK training tools require triphone-level transcriptions. One of the
drawbacks of modeling triphones is the large number of triphones that can be formed from a
small number of monophones. In this case, there were 44 monophones and 44^3 = 85184
possible triphones. It was not reasonable to expect ever to have enough training data to properly
train 85184 individual triphones. The idea then was to train very similar triphones on the same
data, which raised the question of how to decide which triphones are similar. HTK provides a
method for clustering similar triphones into groups and for tying these groups to the same
training data. The clustering method uses the phonetic articulatory features given in Table 3
to classify the triphones.
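
To make the triphone idea concrete, the sketch below rewrites a monophone sequence in the left-center+right notation used by HTK (for example `m-u+k`) and checks the count quoted in the text. It is an illustration only: the function name `to_triphones` is hypothetical, and treating `sil` and `sp` as context-free boundaries that contexts do not cross is an assumption of this sketch, not a statement of how USMA configured the conversion.

```python
BOUNDARIES = {"sil", "sp"}        # kept as plain monophones in this sketch
POSSIBLE_TRIPHONES = 44 ** 3      # 44 monophones in each of three positions

def to_triphones(phones):
    """Rewrite a monophone sequence as context-dependent names (l-p+r)."""
    out = []
    for i, phone in enumerate(phones):
        if phone in BOUNDARIES:
            out.append(phone)
            continue
        left = phones[i - 1] if i > 0 else None
        right = phones[i + 1] if i + 1 < len(phones) else None
        name = phone
        if left is not None and left not in BOUNDARIES:
            name = left + "-" + name      # add left context
        if right is not None and right not in BOUNDARIES:
            name = name + "+" + right     # add right context
        out.append(name)
    return out
```

Most of the 85184 combinations never occur in the training data, which is why clustering and tying similar triphones is needed.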

Step  4.0  Creating  the  Language  Model 

Speech  recognition  systems  need  more  than  just  an  acoustic  model  and  a  dictionary  in 
order  to  produce  text  from  speech.  They  also  require  a  language  model.  USMA  wrote  the 
Arabic  language  model  from  the  English  specifications  for  the  microworld. 

Step  5.0  Recognizer  Evaluation 

To test the Arabic recognizer, USMA used 875 recordings from 21 nonnative speakers.
Each informant read from a list of 40 sentences. The standard sampling rate of 22,050 Hz was
used. The testing was performed on a 300 MHz Pentium II computer with 128 MB RAM,
running the Linux operating system. Three different lattices were used for testing.

1. A lattice containing a list of the 40 test sentences.

2.  A  much  larger  lattice  into  which  we  embedded  the  40  test  sentences. 

3.  A  word  loop  containing  the  126  words  in  the  script. 

The  tests  were  performed  on  models  with  9  mixture  components.  The  percentage  of 
correct  sentences  recognized  increased  as  we  increased  the  number  of  mixture  components.  The 
delay  between  an  utterance  and  recognition  seems  to  increase  as  the  mixture  components 
increase,  but  USMA  has  not  measured  this  yet.  Performance  also  seems  to  depend  on  the 
architecture. 


Lattice                    Perplexity   % of Correct Sentences   % of Correct Words

40 sentences                 2.826634   99.66                    99.67
Microworld + 40 sentences    9.670476   92.46                    91.76
Wordloop                   108.095966   45.14                    83.69
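
The report does not state how the two percentages were computed. The sketch below shows one common convention, under the assumption that a sentence counts as correct only on an exact match and that a reference word counts as correct unless it is substituted or deleted in a minimum-edit alignment. All function names are hypothetical.

```python
def align_counts(ref, hyp):
    """Return (substitutions, deletions, insertions) for a minimum-edit alignment."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    # Each cell holds (total_errors, subs, dels, ins) for the best alignment so far.
    table = [[None] * cols for _ in range(rows)]
    table[0][0] = (0, 0, 0, 0)
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)   # only deletions
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)   # only insertions
    for i in range(1, rows):
        for j in range(1, cols):
            e, s, d, n = table[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, s, d, n)          # match
            else:
                diagonal = (e + 1, s + 1, d, n)  # substitution
            e, s, d, n = table[i - 1][j]
            deletion = (e + 1, s, d + 1, n)
            e, s, d, n = table[i][j - 1]
            insertion = (e + 1, s, d, n + 1)
            table[i][j] = min(diagonal, deletion, insertion)
    _, subs, dels, ins = table[-1][-1]
    return subs, dels, ins


def score(pairs):
    """Return (% correct sentences, % correct words) for (reference, hypothesis) pairs."""
    total_words = correct_words = correct_sentences = 0
    for reference, hypothesis in pairs:
        ref, hyp = reference.split(), hypothesis.split()
        subs, dels, _ = align_counts(ref, hyp)
        total_words += len(ref)
        correct_words += len(ref) - subs - dels
        correct_sentences += (ref == hyp)
    return (100.0 * correct_sentences / len(pairs),
            100.0 * correct_words / total_words)


if __name__ == "__main__":
    # Illustrative English pairs, not data from the evaluation.
    print(score([("open the door", "open the door"),
                 ("close the box", "close a box")]))
```

Under this convention a single wrong word fails the whole sentence, which is consistent with the word loop scoring far lower on sentences than on words.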

Task 8: Develop Arabic, Spanish and English CSR Components

For  continuous  speech  recognition  using  HTK,  each  language  requires  the  following 
components: 

•  A  set  of  acoustic  models  (HMMs) 

•  A  dictionary  mapping  the  required  words  into  the  acoustic  models 

•  A  network  specifying  the  recognition  grammar 

•  A  configuration  file 

•  An  application  program

Acoustic Models

The  Arabic  acoustic  model  development  is  described  in  Task  7.  The  English  and  Spanish 
acoustic  models  were  developed  by  Entropic  Cambridge  Research  Laboratory  (ECRL)  and  are 
supplied  with  the  speech  recognition  development  tools. 


The English phone set (Table 4) consists of 41 distinct speech phones plus the silence and
short pause phones.

Symbol  Example   Symbol  Example

aa      balm      b       bet
aa      box       d       debt
ah      but       k       cat
ao      bought    p       pet
aw      bout      t       tat
ax      about     dh      that
ay      bite      th      thin
eh      bet       f       fan
er      bird      v       van
ey      bait      s       sue
ih      bit       sh      shoe
iy      beet      z       zoo
ow      boat      zh      measure
oy      boy       ch      cheap
uh      book      jh      jeep
uw      boot      m       met
l       led       n       net
r       red       en      button
w       wed       ng      thing
y       yet       sil     silence
hh      hat       sp      short pause

Table 4 - English Phone Set (Power, 1997)

The Spanish phone set (Table 5) contains 25 distinct speech phones plus the two silence
phones.

Symbol  Example   Symbol  Example

a       casa      b       boca
a       esta      d       dolor
e       dije      g       gota
i       vino      k       cama
o       hablo     p       peso
u       sucre     t       techo
w       güera     f       fino
y       ayer      j       gel
l       lago      s       asa
ll      llave     z       azote
r       aro       m       mano
rr      arroz     n       nombre
sil     silence   sp      short pause

Table 5 - Spanish Phone Set (Power, 1997)

Dictionaries 

The  Arabic  dictionary  development  is  described  in  Task  7.  The  English  and  Spanish 
dictionaries  used  in  MILT  are  subsets  of  the  dictionaries  developed  by  Entropic  Cambridge 
Research  Laboratory  supplied  with  the  speech  recognition  development  tools. 

The ECRL-supplied English and Spanish dictionaries consist of 90,000 words. MA&D
used the full dictionaries while developing the CSR components and built smaller dictionaries for
use in the MILT application. The smaller dictionaries contain only the words used in the
microworld exercise. The English dictionary has 290 words, the Spanish has 214 words, and the
Arabic has 495 words.
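
Building a task-sized dictionary from a large one amounts to a filter over the task vocabulary. The sketch below is illustrative only: the in-memory layout (word mapped to a list of pronunciations), the function name, and the sample pronunciations are assumptions, not the ECRL file format or the MA&D procedure.

```python
def build_task_dictionary(full_dictionary, task_words):
    """Keep only the entries needed by the task vocabulary.

    Returns the reduced dictionary and the list of task words that have
    no entry in the full dictionary (these need hand-written pronunciations).
    """
    subset = {}
    missing = []
    for word in sorted(set(task_words)):
        if word in full_dictionary:
            subset[word] = full_dictionary[word]
        else:
            missing.append(word)
    return subset, missing


if __name__ == "__main__":
    # Toy entries written with symbols from the English phone set.
    full = {
        "door": [["d", "ao", "r"]],
        "the": [["dh", "ax"], ["dh", "iy"]],
        "book": [["b", "uh", "k"]],
        "boat": [["b", "ow", "t"]],
    }
    subset, missing = build_task_dictionary(full, ["the", "door", "book", "drawer"])
    print(sorted(subset))  # words kept for the exercise
    print(missing)         # words that still need a pronunciation
```

Reporting the uncovered words separately matters in practice, because every word in the recognition network must have at least one pronunciation.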

Networks 

Language  models  for  all  languages  were  developed  by  MA&D  using  the  Entropic  HTK 
grapHvite  software.  grapHvite  is  a  developer  kit  for  building  small  to  medium  vocabulary 
speech  recognition  systems.  It  provides  the  means  for  creating  the  recognizer  components  for  a 
given  speech  recognition  task. 

In particular, the Netbuilder portion of grapHvite was used. Netbuilder is a visual tool for
rapidly creating a syntax network and dictionary for a task-specific recognizer. It provides
tools for checking that a network, once created, is syntactically correct and for interactively
testing a network with the grapHvite decoder. In addition, Netbuilder allows existing
pronunciations to be edited and pronunciations to be added for words not covered by any of the
supplied dictionaries.

Configuration  Files 

Configuration  files  control  various  parameters  of  the  speech  recognizer.  Each  language 
required  a  unique  configuration  file  to  be  developed.  The  configuration  file  sets  general 
parameters,  directory  defaults,  suffix  defaults  and  system  configurations. 

Application  Program 

The  MILT  microworld  contains  application  code  written  in  C++  to  initialize  the  speech 
recognition  and  process  results.  In  addition,  Java  application  programs  were  developed  to  test 
the  speech  recognition  components  separate  from  the  microworld. 

Task  9:  Expand  the  Arabic  NLP  System 

The  Natural  Language  Processor  (NLP)  embedded  in  MILT  was  developed  by  the 
University  of  Maryland.  The  NLP  works  by  parsing  a  sentence  in  a  bottom-up  process  by 
identifying  the  individual  words  and  then  determining  the  relationships  among  them.  The  main 
components  of  the  parser  are  a  preprocessor,  a  morphological  analyzer,  a  lexicon,  a  syntactic 
parser,  an  error  handling  facility,  and  a  semantic  interpreter.  First,  word  strings  are  submitted  to 
an  interactive  preprocessor  that  identifies  spelling  mistakes.  The  morphological  analyzer  then 
decomposes  the  words  into  subparts  (i.e.,  roots  and  affixes)  based  on  information  in  the  lexicon 
(similar  to  a  dictionary).  Specific  information  is  drawn  from  the  lexicon  about  the  word  subparts 
(e.g.,  whether  it  is  a  verb  or  noun,  singular  or  plural).  The  word  subparts  are  unified  back  into 
their  original  state  and  passed  along  with  the  descriptive  information  from  the  lexicon  to  the 
syntactic  parser. 

Based on this information, the parser tries to build a structure, called a parse tree, which
reflects the appropriate relationships among the words. The parser moves words around, trying
to find all possible constructions that satisfy a minimal set of basic phrase structure rules (i.e.,
language-universal abstract principles). Two interacting modules also operate at this stage. The
first, the required constraints component, establishes the bare essentials needed for the parts of
a sentence to be at least somewhat intelligible. The second, the preferred constraints component,
contains the categories of grammatical errors to flag once the sentence or phrase has been
successfully parsed. In other words, the parser at first overgenerates possible structures and then
imposes constraints ("weeds them out") based on language-specific rules. This approach provides
robustness (i.e., the parser does not fail as it encounters each error) and allows specification of
the grammatical error types that matter to the instructional objectives.

Let  us  illustrate  using  an  example  in  English,  "The  girl  write  the  story."  All  the  words  and 
subparts  are  recognized  and  passed  on  to  the  parser.  "Write"  has  been  identified  as  a  verb  that 
often  has  an  inanimate  object.  The  parser  looks  for  an  inanimate  object  in  the  appropriate 
position  (following  the  verb)  to  complete  the  verb  phrase  branch  of  the  tree,  "write  story".  It 
continues  on  in  this  manner  until  it  completes  the  parse  tree  and  the  error  handler  conveys  to  the 
tutor that a subject-verb agreement error ("girl-write") was made. The tutor reads the error data
and  uses  that  information  to  drive  specific  feedback  to  the  student  and  to  build  the  student  model. 
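
The division of labor between required and preferred constraints can be sketched with a toy example. This is not the University of Maryland parser: the lexicon, the single fixed sentence pattern, and the function name are all assumptions made for illustration. The point is only that a violated preferred constraint is reported alongside a completed parse tree rather than causing the parse to fail.

```python
# Hypothetical miniature lexicon; "number" is the agreement feature.
LEXICON = {
    "the": {"cat": "Det"},
    "girl": {"cat": "N", "number": "sg"},
    "girls": {"cat": "N", "number": "pl"},
    "story": {"cat": "N", "number": "sg"},
    "write": {"cat": "V", "number": "pl"},
    "writes": {"cat": "V", "number": "sg"},
}


def parse(sentence):
    """Return (parse_tree, flagged_errors); parse_tree is None if no parse exists."""
    words = sentence.lower().rstrip(".").split()
    unknown = [w for w in words if w not in LEXICON]
    if unknown:
        return None, [("unknown word", w) for w in unknown]

    # Required constraint: the words must fit the pattern Det N V Det N.
    categories = [LEXICON[w]["cat"] for w in words]
    if categories != ["Det", "N", "V", "Det", "N"]:
        return None, [("no parse", sentence)]

    subject, verb, obj = words[1], words[2], words[4]
    tree = ("S", ("NP", words[0], subject), ("VP", verb, ("NP", words[3], obj)))

    # Preferred constraint: flag the error but keep the completed tree.
    errors = []
    if LEXICON[subject]["number"] != LEXICON[verb]["number"]:
        errors.append(("subject-verb agreement", subject, verb))
    return tree, errors


if __name__ == "__main__":
    tree, errors = parse("The girl write the story.")
    print(tree)
    print(errors)
```

A tutor built on such output can give targeted feedback because it receives both the structure and the specific error category.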

The  parser  can  even  handle  some  reasonable  sentence  fragments,  so  that  if  the  tutor  asked 
where  the  castle  is  located,  the  student  could  appropriately  reply,  "to  the  north  of  Berlin". 
However, the NLP could not capture every possible sentence construction and grammatical
error. Therefore, the range of input and error types was determined through analysis of language
usage by military linguists in job-specific environments. The most common, problematic, and
critical areas identified were those included in the NLP development.

The  Natural  Language  Processor  Subsystem  can  be  broken  down  into  the  following  parts: 

•  Lexicon 

•  Parser 

•  Lexical  Conceptual  Structure  (LCS) 

•  Semantic  component 

Communication  within  the  NL  system  is  as  follows:  input  is  passed  to  the  lexicon, 
lexicon  items  go  to  the  parser,  the  parser  produces  a  parse  structure  which  is  passed  to  the  LCS 
system,  and  the  LCS  system  passes  information  to  the  semantic  component  for  matching  or  for 
discourse.  Almost  all  communication  is  "point  to  point,"  that  is,  components  directly  call  or 
return  values  to  other  subsystems.  The  one  exception  is  the  communication  of  information  from 
the  parser  to  the  LCS,  which  must  go  through  a  special  C  function  call,  as  there  is  no  direct 
communication  from  Prolog  to  Lisp. 

This  task  involved  expanding  the  Arabic  NLP  system  used  in  MILT  version  1.0  to 
support  the  new  objects  and  actions  in  the  3D  microworld. 
The new entries are:

Surface  Root                    Collocations                          Gloss      Category  Primitive

>Hml     Hml<ao-N Hml<io-N       'ilaY Null                            carry      V         cause go_loc
<SEd     SEd<Iia                 EalY Ean fwq<ao-N Null                climb      V         go_loc
<nzl     nzl<Iia nzl<Iai         Ean taHt                              Descend    V         go_loc
>dxl     dxl<IV dxl<Iau          min fl Null                           Enter      V         go_loc
>glq     glq<Iai glq<IV          Null                                  Close      V+ed      cause go_ident
gTy      gTw<Iau gTw<II          bi                                    Cover      V+ed      cause go_ident
zHf      zHf<ao-N                fl min taHt                           Crawl      V         go_loc
<$rb     $rb<Iia                 Null                                  Drink      V         cause go_ident
<rm      rmX<Iai                 Null                                  Throw      V         cause go_loc
>kl      'kl<Iau 'kl<II          Null                                  Eat        V+ed      cause go_ident
qf       wqf<Iai                 Ean                                   Stand      V         go_loc
<jlb     jlb<Iai                 min Null                              Bring      V         cause go_loc
q@mp     qwm<aaap-Nap            Null                                  inventory  N
>ET      ETy<IV                  li                                    Give       V         cause go_loc
>rjE     q'Edai                  Null                                  return     V         go_loc
<*hb     *hb<Iaa                 'ilaY min fl                          go         V         go_loc
$rq      $rq<ao-N                Null                                  east       N
-        $ml<aA-N                Null                                  north      N
-        ysr<aA-N                Null                                  left       N
jnwb     jnb<aU-N                Null                                  south      N
grb      grb<ao-N                Null                                  West       N
ymyn     ymn<aI-N                Null                                  Right      N
Elq      Elq<Iia Elq<V           EalY                                  Hang       V+ed      cause go_ident
Swrp     SWr<uoap-Napdu          Null                                  Picture    N
Hy@T     HwT<iA-N HwT<=apl-Ndu   Null                                  Wall       N
smE      smE<Iia smE<II          Null                                  hear       V         go_perc
>Drb     Drb<Iai                 Null                                  hit        V+ed      cause go_ident
<msk     msk<Iai                 Null                                  Hold       V         cause stay_loc
>dxl     dxl<IV                  fl Null                               insert     V         cause go_loc
qfz      qfz<Iai                 fwq<ao-N Ean EalY Null 'ilaY          Jump       V         go_loc
>lbT     lbT<Iai                 Null                                  Kick       V+ed      cause go_ident
qrE      qrE<Iaa                 Null EalY                             Knock      V         go_loc
b@b      bwb<aa-Ndu              Null                                  Door       N
T@wlp    TAWil                   Null                                  Table      N
@vb      wvb<Iai                 fwq<ao-N EalY                         Leap       V         go_loc
>trk     trk<Iau                 EalY Null                             Leave      V         go_loc
<rfE     rfE<Iaa                 Null                                  Lift       V         go_loc
<stmE    smE<VIII                'ilaY                                 -          V         go_perc
jhz      jhz<II                  Null                                  -          V         cause go_loc
$ryT     $rT<aI-Ndu              Null                                  Cassette   N
>qfl     qfl<Iai qfl<IV          Null                                  Lock       V+ed      cause go_ident
>nZr     nZr<Iau                 Null 'ilaY wry<aA-N taHt fl fwq<ao-N  Look       V         go_perc
wr@'     wry<aA-N                Null                                  behind     P

Table 6 - New Arabic NLP Entries

Task  10:  Deliver  the  software 

Interim  versions  of  the  software  were  delivered  to  ARI  throughout  the  contract.  The  final 
software  delivery  date  was  January  31, 1999. 

Task  11:  Develop  an  Arabic  continuous  speech  recognition  exercise 

A  sample  microworld  exercise  was  developed  for  this  task.  In  the  sample  exercise, 
students  search  a  room  of  an  enemy  prisoner.  Student  commands  result  in  movement  of  objects 
around  the  room.  The  goal  of  the  exercise  is  for  the  student  to  find  a  letter  that  reveals  the 
location  of  the  city  to  which  the  enemy  forces  intend  to  move. 

Directions:  Our  intelligence  reports  tell  us  that  there  is  to  be  an  attack  launched  soon  and 
that  information  about  this  attack  is  located  in  this  room.  You  are  to  search  this  room  to  find  out 
where  the  attack  is  going  to  occur.  You  may  have  to  look  in  objects  (cabinets,  boxes,  etc.)  to 
find  the  documents  describing  the  attack. 

Question:  Where  are  the  troops  moving? 

Answer:  Srqnd 

List  of  Objects: 

file cabinet - visible
table - visible
waste basket - visible
box - visible
briefcase - visible on floor
lamp - visible on table
radio - visible on table
envelope - in briefcase
letter - in envelope
gun - in file cabinet
map - in box
newspaper - in file cabinet
book - in briefcase
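
The object list is in effect a containment structure: some objects are visible at the start, and others become reachable only after their containers are opened. The sketch below is one hypothetical way to represent it; the names are illustrative and do not reflect the MILT authoring format.

```python
# Objects visible when the exercise starts (from the list of objects).
VISIBLE = {"file cabinet", "table", "waste basket", "box",
           "briefcase", "lamp", "radio"}

# Hidden objects mapped to the container that holds them.
CONTAINED_IN = {
    "envelope": "briefcase",
    "letter": "envelope",
    "gun": "file cabinet",
    "map": "box",
    "newspaper": "file cabinet",
    "book": "briefcase",
}


def containers_to_open(target):
    """Return the containers to open, outermost first, to reach the target."""
    path = []
    current = target
    while current in CONTAINED_IN:
        current = CONTAINED_IN[current]
        path.append(current)
    return list(reversed(path))


if __name__ == "__main__":
    # The letter that answers the exercise question is nested two levels deep.
    print(containers_to_open("letter"))
```

For the letter, the student must open the briefcase and then the envelope, so at least two commands are needed before the answer can be read.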

Radio: When the student turns on the radio, he will hear the following:

"Good  Morning  Halabja!  The  following  song  is  dedicated  to  the  brave  men  who  are 
fighting  for  our  just  cause  near  the  great  river.  We  pray  God  that  you  are  safe  and  strong." 

Newspaper: 

headline  reads: 

Troop  Morale  is  High 

text: 

The soldiers are enjoying a sense of great confidence because of their
excellent training in the use of modern weapons and also because they have trust in our Air Force
and its capability to clear our skies of enemy planes and provide air coverage for our ground
forces.

Book:

Title:

Infantry Code Manual

text page 1:

January - February 1999
Sensitive Classified Information

text page 2:

frequency   call sign

100.3       CEOI
97.5        XR29
85.0        J2Q

Envelope: 

Leila  Sadawi 
Arab  Republic 
Najran 

1441  al-Hadeekah  St 

Letter: 

My  Beloved  Leila: 
I am writing to you from my position on the border. I am very well. Everything
here is in a state of readiness for war. When I got here I helped dig trenches and carry
food supplies. Then we resumed our daily training. Every now and then, I go on
reconnaissance  missions  in  Halabja. 

I  miss  you  so  much  and  want  this  war  to  be  over  so  I  can  see  you.  I  do  not  know 
when  this  will  happen,  but  it  is  likely  that  the  war  will  be  a  long  one  because  we  are  about 
to  move  towards  Srqand.  This  will  make  the  war  range  on  a  larger  scale  and  we  may  also 
make  more  movements. 

I  don't  want  you  to  worry  too  much  about  me.  My  hope  is  so  great  in  being  able 
to  see  you  again  after  the  war  is  over.  Look  after  yourself  and  do  not  forget  me  or  forget 
writing  to  me.  It  is  unfortunate  that  your  letters  reach  me  late,  but  do  not  keep  me 
bereaved  of  your  sweet  words. 

Love, 

Saaber 


Task  12:  Prepare  system  documentation  and  user  help 

Documentation prepared for MILT 2.0 under this task includes the “MILT 2.0 Software
User’s Manual,” the author help file, the student help file, the main help file, the 3D microworld
authoring help file, an author tutorial, grammar references for Arabic and Spanish, and the final
report.

Task  13:  Complete  monthly  progress  reports 

MA&D  provided  monthly  status  reports  during  the  project  on  the  15th  of  each  month. 

The  reports  each  contained  the  following  information: 

•  Progress  during  the  month  including  a  summary  of  completed  and  on-going  activities 

•  Progress  projected  for  the  following  month 

•  Problems  encountered  or  anticipated 

•  Costs incurred, both direct and indirect

Task  14:  Complete  final  report 

This  document  is  the  result  of  this  task. 

Summary 

This  report  documented  the  design  and  development  of  an  authorable,  speech  recognition 
enhanced,  three-dimensional  microworld.  The  speech  recognition  incorporated  into  MILT  is 
corpus-based,  continuous  and  speaker-independent.  Major  components  of  the  developed  system 
are  continuous  speech  recognition  components  for  English,  Arabic,  and  Spanish,  an  authorable 
3D  microworld,  and  an  expanded  Arabic  natural  language  processing  (NLP)  system. 

Appendix A

ArabTeX  Conventions 

ArabTeX is a package extending the capabilities of TeX/LaTeX to generate Arabic
writing from an ASCII transliteration for texts in several languages using the Arabic script.

Standard Arabic and Persian characters:


b  bah             t  tah             _t thah            ^g geem            .h hhah            _h khah
d  dal             _d dhal            r  rah             z  zay             s  seen            ^s sheen
.s ssad            .d ddad            .t ttah            .z tthah           `  'ain            .g ghain
f  fah             q  qaf             k  kaf             l  lam             m  meem            n  noon
h  hah             w  waw             y  yah             g  gaf             p  pah             v  vah
'  hamza           N  tanween         Y  alif maqsoura   A  alif maqsoura   T  tah marbouta    W  waw (see below)

Additional  characters  generally  available: 


c   hhah with hamza
^c  gim with three dots (below)
,c  khah with three dots (above)
^z  zay with three dots (above)
~n  kaf with three dots (Ottoman)
~l  lam with a bow accent (Kurdish)
~r  rah with two bows (Kurdish)


